1)  identify duplicate records
重复记录 (duplicate records)
1.
In the model, an improved edit-distance-based algorithm is proposed to match strings; an attribute matching graph is constructed and a two-step verification strategy is adopted to identify duplicate records.
This makes identifying duplicate records very inconvenient, so traditional deduplication algorithms cannot achieve good results.
2)  repeatable recording
可重复记录 (repeatable recording)
3)  Dual Recordings
双重记录, 复式记录 (dual record; double-entry record)
4)  approximately duplicate record
近似重复记录 (approximately duplicate record)
1.
This paper studied the problem of detecting approximately duplicate records while receiving increments of data with no changes in data schema and matching rule set, and presented an incremental algorithm IACT (Incremental Algorithms based on Clustering Trees for data cleansing).
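Under the premise that the data schema and matching rules remain unchanged, this work studies the problem of identifying approximately duplicate records as the data set grows dynamically, and proposes IACT, an incremental data cleansing algorithm based on clustering trees.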
2.
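Based on this idea, we study the problem of detecting approximately duplicate records while receiving increments of data with no changes in the data schema or matching model, and present an incremental algorithm for detecting such records.
The Priority Queue Strategy (PQS) is introduced and, on that basis, the problem of identifying approximately duplicate records as data sources grow dynamically, with the data schema and matching model unchanged, is studied; an incremental algorithm, IPQS (Incremental PQS), is proposed, and experimental results are given.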
5)  Approximately duplicated records
相似重复记录 (similar duplicate records)
1.
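Eliminating approximately duplicated records is a hot issue in the data cleansing operations of a data warehouse, in which address information plays an important role in identifying the same entity.
Identifying and eliminating similar duplicate records in a data warehouse is a hot issue in data cleansing, in which address information plays a very important role in identifying the same entity.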
2.
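Detecting and eliminating approximately duplicated records is one of the main problems that need to be solved for data mining and data quality improvement.
To address the problem of detecting and eliminating similar duplicate records in a data warehouse, a method for detecting such records in a data warehouse is proposed.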
3.
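Examining and eliminating approximately duplicated records is one of the main problems to be solved in cleaning data and improving data quality.
Detecting and eliminating similar duplicate records in a data warehouse is one of the main problems to be solved in cleaning data and improving data quality.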
6)  approximately duplicate records
相似重复记录 (similar duplicate records)
1.
Research on an optimal feature selection method for detecting approximately duplicate records
Research on a feature optimization method based on the detection of similar duplicate records
2.
This article presents a synthetic approach for detecting approximately duplicate records.
Similar duplicate records are one of the key problems affecting data quality in data integration systems.
3.
The common approach to marking approximately duplicate records is to index the records by a certain keyword and then compare them pairwise within a fixed-length window.
To mark the similar duplicate records in a data table, the usual method is to first index all records by some keyword and then compare the records pairwise within a window of fixed length.
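
The windowed comparison described in the last example sentence can be sketched roughly as follows. This is an illustrative sketch only, not the method of the quoted paper: the function name `sorted_neighborhood`, its parameters, the 0.85 threshold, the sample records, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def sorted_neighborhood(records, key, window=3, threshold=0.85):
    """Return index pairs of records judged approximately duplicate.

    records   -- list of strings (hypothetical single-field records)
    key       -- function that builds the sort key for a record
    window    -- fixed window length (records compared together)
    threshold -- assumed similarity cutoff in [0, 1]
    """
    # Step 1: order the record indices by the chosen key.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = []
    # Step 2: slide a fixed-length window over the sorted order and
    # compare each record with the (window - 1) records before it.
    for pos, i in enumerate(order):
        for j in order[max(0, pos - window + 1):pos]:
            ratio = SequenceMatcher(None, records[i], records[j]).ratio()
            if ratio >= threshold:
                pairs.append((min(i, j), max(i, j)))
    return sorted(pairs)


# Made-up sample data for illustration.
records = [
    "John Smith, 12 Main Street",
    "Mary Jones, 4 Oak Avenue",
    "John Smith, 12 Main St.",
    "Peter Brown, 9 Hill Road",
]
duplicates = sorted_neighborhood(records, key=str.lower, window=2)
print(duplicates)
```

Sorting first means that only neighbouring records are compared, so the number of comparisons grows with the window length rather than with the square of the table size. The trade-off is that duplicates whose keys sort far apart are missed.
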
Supplementary material: 重复 (replication)

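Properties: Replication means carrying out repeated determinations and measurements under the same experimental conditions. Its purpose is to estimate the experimental error and to improve the precision with which the mean is determined. It is one of the three basic principles of experimental design proposed by R. A. Fisher; the other two are local control and randomization.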
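
As a small illustration of how replication is used to estimate experimental error, the sketch below computes the mean, the sample standard deviation, and the standard error of the mean from repeated measurements. It assumes the measurements are independent and share the same variance; the function name `summarize_replicates` and the sample values are hypothetical.

```python
import math
import statistics


def summarize_replicates(measurements):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(measurements)
    mean = statistics.mean(measurements)
    # The sample standard deviation estimates the experimental error
    # of a single measurement.
    sd = statistics.stdev(measurements)
    # The standard error of the mean shrinks as 1 / sqrt(n), which is
    # how more replicates improve the precision of the mean.
    sem = sd / math.sqrt(n)
    return mean, sd, sem


# Made-up replicate measurements for illustration.
mean, sd, sem = summarize_replicates([9.0, 11.0, 10.0, 10.0])
print(mean, sd, sem)
```

At least two replicates are needed, since `statistics.stdev` cannot estimate a spread from a single value.
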
