E-resources
Peer reviewed
Open access
Araújo, José; Silva, Juan; Costa-Martins, André; Sampaio, Vanderson; Castro, Daniel; Souza, Robson; Giddaluru, Jeevan; Ramos, Pablo; Pita, Robespierre; Barreto, Maurício; Netto, Manoel; Nakaya, Helder
International journal of population data science, 08/2022, Volume: 7, Issue: 3
Journal Article
Objective: Public health research frequently requires the integration of information from different data sources. However, errors in the records and the high computational costs involved make linking large administrative databases using record linkage (RL) methodologies a major challenge. We present Tucuxi-BLAST, a versatile tool for probabilistic RL that uses a DNA-encoded approach to encrypt, analyze and link massive administrative databases.
Materials and Methods: Tucuxi-BLAST encodes the identification records into DNA. The BLASTn algorithm is then used to align the sequences between databases. We tested and benchmarked the tool on a simulated database containing records for 300 million individuals and on four large administrative databases containing real data on Brazilian patients.
Results: Our method was able to overcome misspellings and typographical errors in administrative databases. In processing the largest simulated dataset (200k records), the state-of-the-art method took 5 days and 7 hours to perform the RL, while Tucuxi-BLAST took only 23 hours. When compared with five existing RL tools applied to a gold-standard dataset from real health-related databases, Tucuxi-BLAST had the highest accuracy and speed.
Discussion: By repurposing genomic tools, researchers are able to perform subject tracing across multiple large epidemiological databases using a regular laptop.
Conclusion: Tucuxi-BLAST can improve data-driven medical research and provide a fast and accurate way to link individual information across several administrative databases.
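The idea of encoding identification records as DNA can be illustrated with a minimal sketch. The abstract does not state the encoding scheme, so the mapping below (each UTF-8 byte becomes four nucleotides, two bits per nucleotide) is an assumption made purely for illustration, and the names `encode_to_dna` and `identity` are hypothetical. The position-wise identity score is a crude stand-in for a BLASTn alignment, not a call to BLAST itself. The sketch only shows why a single typo leaves most of the encoded sequence unchanged.

```python
# Illustrative assumption: each byte is written as four nucleotides (2 bits each).
# This is NOT the encoding used by Tucuxi-BLAST, which the abstract does not describe.
NUCLEOTIDES = "ACGT"


def encode_to_dna(text: str) -> str:
    """Encode text as a DNA string, mapping each UTF-8 byte to four nucleotides."""
    dna = []
    for byte in text.upper().encode("utf-8"):
        # Read the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            dna.append(NUCLEOTIDES[(byte >> shift) & 0b11])
    return "".join(dna)


def identity(seq_a: str, seq_b: str) -> float:
    """Fraction of matching positions; a toy substitute for an alignment score."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this toy score only handles equal-length sequences")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


# Two made-up records that differ by a single typographical error.
record_a = encode_to_dna("MARIA SILVA 1980-05-17")
record_b = encode_to_dna("MARIA SILVS 1980-05-17")
print(identity(record_a, record_b))
```

A real alignment tool also tolerates insertions and deletions, which this equal-length comparison cannot handle. That tolerance is one reason a sequence aligner suits records with missing or extra characters.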