Stage:
completed
Fetched:
07 Jun 17:33
Validated:
07 Jun 17:33
Deltas Created:
07 Jun 17:33
Units Normalized:
07 Jun 17:37
Ancestry Built:
07 Jun 17:34
Nodes Matched:
07 Jun 17:37
Names Parsed:
07 Jun 17:34
New Models Stored:
07 Jun 17:33
Indexed:
07 Jun 17:37
Completed:
07 Jun 17:39
Time to Harvest:
less than a minute
Harvesting Log
(165 lines)
[INFO] [2021-06-07 17:33:22] Created harvest instance #3999
[STOP] [2021-06-07 17:33:22] create_harvest_instance
[START] [2021-06-07 17:33:22] fetch_files
[STOP] [2021-06-07 17:33:22] fetch_files
[START] [2021-06-07 17:33:22] validate_each_file
[INFO] [2021-06-07 17:33:22] Looping over 4 formats...
[INFO] [2021-06-07 17:33:22] ...refs (/app/public/data/ceram_sea_sp_lis/reference.tab)
[INFO] [2021-06-07 17:33:22] Valid: /app/public/converted_csv/ceram_sea_sp_lis_refs_3999.csv (2 lines)
[INFO] [2021-06-07 17:33:22] ...nodes (/app/public/data/ceram_sea_sp_lis/taxon.tab)
[INFO] [2021-06-07 17:33:22] Valid: /app/public/converted_csv/ceram_sea_sp_lis_nodes_3999.csv (3500 lines)
[INFO] [2021-06-07 17:33:22] ...occurrences (/app/public/data/ceram_sea_sp_lis/occurrence.tab)
[INFO] [2021-06-07 17:33:22] Valid: /app/public/converted_csv/ceram_sea_sp_lis_occurrences_3999.csv (1709 lines)
[INFO] [2021-06-07 17:33:22] ...measurements (/app/public/data/ceram_sea_sp_lis/measurement_or_fact_specific.tab)
[INFO] [2021-06-07 17:33:22] Valid: /app/public/converted_csv/ceram_sea_sp_lis_measurements_3999.csv (3418 lines)
[STOP] [2021-06-07 17:33:22] validate_each_file
[START] [2021-06-07 17:33:22] convert_to_csv
[INFO] [2021-06-07 17:33:22] Looping over 4 formats...
[INFO] [2021-06-07 17:33:22] ...refs (/app/public/data/ceram_sea_sp_lis/reference.tab)
[CMD] [2021-06-07 17:33:22] /usr/bin/sort /app/public/converted_csv/ceram_sea_sp_lis_refs_3999.csv > /app/public/converted_csv/ceram_sea_sp_lis_refs_3999.csv_sorted
[INFO] [2021-06-07 17:33:22] Converted: /app/public/converted_csv/ceram_sea_sp_lis_refs_3999.csv (2 lines)
[INFO] [2021-06-07 17:33:22] ...nodes (/app/public/data/ceram_sea_sp_lis/taxon.tab)
[CMD] [2021-06-07 17:33:22] /usr/bin/sort /app/public/converted_csv/ceram_sea_sp_lis_nodes_3999.csv > /app/public/converted_csv/ceram_sea_sp_lis_nodes_3999.csv_sorted
[INFO] [2021-06-07 17:33:23] Converted: /app/public/converted_csv/ceram_sea_sp_lis_nodes_3999.csv (3500 lines)
[INFO] [2021-06-07 17:33:23] ...occurrences (/app/public/data/ceram_sea_sp_lis/occurrence.tab)
[CMD] [2021-06-07 17:33:23] /usr/bin/sort /app/public/converted_csv/ceram_sea_sp_lis_occurrences_3999.csv > /app/public/converted_csv/ceram_sea_sp_lis_occurrences_3999.csv_sorted
[INFO] [2021-06-07 17:33:23] Converted: /app/public/converted_csv/ceram_sea_sp_lis_occurrences_3999.csv (1709 lines)
[INFO] [2021-06-07 17:33:23] ...measurements (/app/public/data/ceram_sea_sp_lis/measurement_or_fact_specific.tab)
[CMD] [2021-06-07 17:33:23] /usr/bin/sort /app/public/converted_csv/ceram_sea_sp_lis_measurements_3999.csv > /app/public/converted_csv/ceram_sea_sp_lis_measurements_3999.csv_sorted
[INFO] [2021-06-07 17:33:23] Converted: /app/public/converted_csv/ceram_sea_sp_lis_measurements_3999.csv (3418 lines)
[STOP] [2021-06-07 17:33:23] convert_to_csv
[START] [2021-06-07 17:33:23] calculate_delta
[INFO] [2021-06-07 17:33:23] Looping over 4 formats...
[INFO] [2021-06-07 17:33:23] ...refs (/app/public/data/ceram_sea_sp_lis/reference.tab)
[CMD] [2021-06-07 17:33:23] echo "0a" > /app/public/diff/ceram_sea_sp_lis_refs_3999.diff
[CMD] [2021-06-07 17:33:23] tail -n +1 /app/public/converted_csv/ceram_sea_sp_lis_refs_3999.csv >> /app/public/diff/ceram_sea_sp_lis_refs_3999.diff
[CMD] [2021-06-07 17:33:24] echo "." >> /app/public/diff/ceram_sea_sp_lis_refs_3999.diff
[INFO] [2021-06-07 17:33:24] Created diff: /app/public/diff/ceram_sea_sp_lis_refs_3999.diff (4 lines)
[INFO] [2021-06-07 17:33:24] ...nodes (/app/public/data/ceram_sea_sp_lis/taxon.tab)
[CMD] [2021-06-07 17:33:24] echo "0a" > /app/public/diff/ceram_sea_sp_lis_nodes_3999.diff
[CMD] [2021-06-07 17:33:24] tail -n +1 /app/public/converted_csv/ceram_sea_sp_lis_nodes_3999.csv >> /app/public/diff/ceram_sea_sp_lis_nodes_3999.diff
[CMD] [2021-06-07 17:33:24] echo "." >> /app/public/diff/ceram_sea_sp_lis_nodes_3999.diff
[INFO] [2021-06-07 17:33:25] Created diff: /app/public/diff/ceram_sea_sp_lis_nodes_3999.diff (3502 lines)
[INFO] [2021-06-07 17:33:25] ...occurrences (/app/public/data/ceram_sea_sp_lis/occurrence.tab)
[CMD] [2021-06-07 17:33:25] echo "0a" > /app/public/diff/ceram_sea_sp_lis_occurrences_3999.diff
[CMD] [2021-06-07 17:33:25] tail -n +1 /app/public/converted_csv/ceram_sea_sp_lis_occurrences_3999.csv >> /app/public/diff/ceram_sea_sp_lis_occurrences_3999.diff
[CMD] [2021-06-07 17:33:25] echo "." >> /app/public/diff/ceram_sea_sp_lis_occurrences_3999.diff
[INFO] [2021-06-07 17:33:25] Created diff: /app/public/diff/ceram_sea_sp_lis_occurrences_3999.diff (1711 lines)
[INFO] [2021-06-07 17:33:25] ...measurements (/app/public/data/ceram_sea_sp_lis/measurement_or_fact_specific.tab)
[CMD] [2021-06-07 17:33:25] echo "0a" > /app/public/diff/ceram_sea_sp_lis_measurements_3999.diff
[CMD] [2021-06-07 17:33:26] tail -n +1 /app/public/converted_csv/ceram_sea_sp_lis_measurements_3999.csv >> /app/public/diff/ceram_sea_sp_lis_measurements_3999.diff
[CMD] [2021-06-07 17:33:26] echo "." >> /app/public/diff/ceram_sea_sp_lis_measurements_3999.diff
[INFO] [2021-06-07 17:33:26] Created diff: /app/public/diff/ceram_sea_sp_lis_measurements_3999.diff (3420 lines)
[STOP] [2021-06-07 17:33:26] calculate_delta
[START] [2021-06-07 17:33:26] parse_diff_and_store
[INFO] [2021-06-07 17:33:26] Handling diff: /app/public/diff/ceram_sea_sp_lis_refs_3999.diff (4 lines)
[INFO] [2021-06-07 17:33:26] Loading refs diff file into memory (4 /app/public/diff/ceram_sea_sp_lis_refs_3999.diff lines)...
[INFO] [2021-06-07 17:33:27] Handling diff: /app/public/diff/ceram_sea_sp_lis_nodes_3999.diff (3502 lines)
[INFO] [2021-06-07 17:33:27] Loading nodes diff file into memory (3502 /app/public/diff/ceram_sea_sp_lis_nodes_3999.diff lines)...
[INFO] [2021-06-07 17:33:28] Handling diff: /app/public/diff/ceram_sea_sp_lis_occurrences_3999.diff (1711 lines)
[INFO] [2021-06-07 17:33:28] Loading occurrences diff file into memory (1711 /app/public/diff/ceram_sea_sp_lis_occurrences_3999.diff lines)...
[INFO] [2021-06-07 17:33:29] Handling diff: /app/public/diff/ceram_sea_sp_lis_measurements_3999.diff (3420 lines)
[INFO] [2021-06-07 17:33:29] Loading measurements diff file into memory (3420 /app/public/diff/ceram_sea_sp_lis_measurements_3999.diff lines)...
[INFO] [2021-06-07 17:33:30] Storing 2 References
[INFO] [2021-06-07 17:33:30] Processing group of 2 in 1 groups of 1000
[INFO] [2021-06-07 17:33:30] Average Time: 0.0
[INFO] [2021-06-07 17:33:30] Total Time: 1s
[INFO] [2021-06-07 17:33:30] Storing 3500 ScientificNames
[INFO] [2021-06-07 17:33:30] Processing group of 3500 in 4 groups of 1000
[INFO] [2021-06-07 17:33:31] Average Time: 0.27
[INFO] [2021-06-07 17:33:31] Total Time: 2s
[INFO] [2021-06-07 17:33:31] Storing 3500 Nodes
[INFO] [2021-06-07 17:33:31] Processing group of 3500 in 4 groups of 1000
[INFO] [2021-06-07 17:33:32] Average Time: 0.215
[INFO] [2021-06-07 17:33:32] Total Time: 1s
[INFO] [2021-06-07 17:33:32] Storing 1709 Occurrences
[INFO] [2021-06-07 17:33:32] Processing group of 1709 in 2 groups of 1000
[INFO] [2021-06-07 17:33:33] Average Time: 0.1
[INFO] [2021-06-07 17:33:33] Total Time: 1s
[INFO] [2021-06-07 17:33:33] Storing 3418 TraitsReferences
[INFO] [2021-06-07 17:33:33] Processing group of 3418 in 4 groups of 1000
[INFO] [2021-06-07 17:33:33] Average Time: 0.08
[INFO] [2021-06-07 17:33:33] Total Time: 1s
[INFO] [2021-06-07 17:33:33] Storing 3418 Traits
[INFO] [2021-06-07 17:33:33] Processing group of 3418 in 4 groups of 1000
[INFO] [2021-06-07 17:33:34] Average Time: 0.288
[INFO] [2021-06-07 17:33:34] Total Time: 2s
[STOP] [2021-06-07 17:33:34] parse_diff_and_store
[START] [2021-06-07 17:33:34] resolve_keys
[INFO] [2021-06-07 17:33:48] Occurrences to nodes (through scientific_names)...
[INFO] [2021-06-07 17:33:55] traits to occurrences...
[INFO] [2021-06-07 17:33:56] traits to nodes (through occurrences)...
[INFO] [2021-06-07 17:33:56] Traits to sex term...
[INFO] [2021-06-07 17:33:57] Traits to lifestage term...
[INFO] [2021-06-07 17:33:57] MetaTraits to traits...
[INFO] [2021-06-07 17:33:57] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-06-07 17:33:57] Assocs to occurrences...
[INFO] [2021-06-07 17:33:57] Assocs to nodes...
[INFO] [2021-06-07 17:33:57] Assoc to sex term...
[INFO] [2021-06-07 17:33:57] Assoc to lifestage term...
[INFO] [2021-06-07 17:33:57] MetaAssoc to assocs...
[STOP] [2021-06-07 17:33:57] resolve_keys
[START] [2021-06-07 17:33:57] hold_for_later_1
[STOP] [2021-06-07 17:33:57] hold_for_later_1
[START] [2021-06-07 17:33:57] hold_for_later_2
[STOP] [2021-06-07 17:33:57] hold_for_later_2
[START] [2021-06-07 17:33:57] resolve_missing_parents
[STOP] [2021-06-07 17:33:59] resolve_missing_parents
[START] [2021-06-07 17:33:59] rebuild_nodes
[START] [2021-06-07 17:33:59] Flattener#flatten
[START] [2021-06-07 17:33:59] Flattener#study_resource
[START] [2021-06-07 17:33:59] Flattener#build_ancestry
[STOP] [2021-06-07 17:33:59] Flattener#build_ancestry
[INFO] [2021-06-07 17:33:59] 3500 ancestry keys
[START] [2021-06-07 17:33:59] build_node_ancestors
[INFO] [2021-06-07 17:33:59] old ancestors deleted.
[STOP] [2021-06-07 17:34:00] build_node_ancestors
[START] [2021-06-07 17:34:01] Flattener#propagate_ancestor_ids
[STOP] [2021-06-07 17:34:01] Flattener#propagate_ancestor_ids
[STOP] [2021-06-07 17:34:01] Flattener#flatten
[STOP] [2021-06-07 17:34:01] rebuild_nodes
[START] [2021-06-07 17:34:01] resolve_missing_media_owners
[STOP] [2021-06-07 17:34:01] resolve_missing_media_owners
[START] [2021-06-07 17:34:01] sanitize_media_verbatims
[STOP] [2021-06-07 17:34:01] sanitize_media_verbatims
[START] [2021-06-07 17:34:01] queue_downloads
[STOP] [2021-06-07 17:34:01] queue_downloads
[START] [2021-06-07 17:34:01] parse_names
[WARN] [2021-06-07 17:34:01] I see 3500 names which still need to be parsed.
[STOP] [2021-06-07 17:34:04] parse_names
[START] [2021-06-07 17:34:04] denormalize_canonical_names_to_nodes
[STOP] [2021-06-07 17:34:04] denormalize_canonical_names_to_nodes
[START] [2021-06-07 17:34:04] match_nodes
[START] [2021-06-07 17:34:04] map_all_nodes_to_pages
[STOP] [2021-06-07 17:37:00] map_all_nodes_to_pages
[INFO] [2021-06-07 17:37:00] 119 Unmatched nodes (of 3500)! That's too many to output. Full list in /app/public/data/ceram_sea_sp_lis/unmatched_nodes.txt ; First 10: Canonical: Siphamia papuensis; Node#95788194; ResourceID: T100651; Canonical: Ptereleotridae; Node#95788045; ResourceID: T100502; Canonical: Scorpaeniformes; Node#95788007; ResourceID: T100464; Canonical: Stephanoberyciformes; Node#95788215; ResourceID: T100672; Canonical: Cetomimiformes; Node#95789223; ResourceID: T101681; Canonical: Hydrophiidae; Node#95787587; ResourceID: T100044; Canonical: Lapemis; Node#95787586; ResourceID: T100043; Canonical: Pleurogona; Node#95788714; ResourceID: T101171; Canonical: Enterogona; Node#95788840; ResourceID: T101298; Canonical: Hadromerida; Node#95787854; ResourceID: T100311
[START] [2021-06-07 17:37:00] update_nodes
[STOP] [2021-06-07 17:37:01] update_nodes
[STOP] [2021-06-07 17:37:01] match_nodes
[START] [2021-06-07 17:37:01] reindex_search
[STOP] [2021-06-07 17:37:05] reindex_search
[START] [2021-06-07 17:37:05] normalize_units
[STOP] [2021-06-07 17:37:05] normalize_units
[START] [2021-06-07 17:37:05] calculate_statistics
[STOP] [2021-06-07 17:37:05] calculate_statistics
[START] [2021-06-07 17:37:05] complete_harvest_instance
[START] [2021-06-07 17:37:05] overall_tsv_creation
[INFO] [2021-06-07 17:37:05] Processing group of 3500 in 1 batches of 10000
[INFO] [2021-06-07 17:37:53] 1709 Traits (unfiltered)...
[INFO] [2021-06-07 17:38:35] 1709 Traits (filtered)...
[INFO] [2021-06-07 17:38:38] 0 Associations (filtered)...
[INFO] [2021-06-07 17:38:38] 3418 metadata added.
[INFO] [2021-06-07 17:38:38] 0 metadata added.
[INFO] [2021-06-07 17:39:06] Average Time: 92.69
[INFO] [2021-06-07 17:39:06] Total Time: 2m1s
[STOP] [2021-06-07 17:39:06] overall_tsv_creation
[INFO] [2021-06-07 17:39:06] Done. Check your files:
[INFO] [2021-06-07 17:39:06] (3500 lines) /app/public/data/ceram_sea_sp_lis/publish_nodes.tsv
[INFO] [2021-06-07 17:39:06] (17904 lines) /app/public/data/ceram_sea_sp_lis/publish_node_ancestors.tsv
[INFO] [2021-06-07 17:39:06] (3500 lines) /app/public/data/ceram_sea_sp_lis/publish_scientific_names.tsv
[INFO] [2021-06-07 17:39:07] (1710 lines) /app/public/data/ceram_sea_sp_lis/publish_traits.tsv
[INFO] [2021-06-07 17:39:07] (3419 lines) /app/public/data/ceram_sea_sp_lis/publish_metadata.tsv
[STOP] [2021-06-07 17:39:07] complete_harvest_instance
[START] [2021-06-07 17:39:07] completed
[STOP] [2021-06-07 17:39:07] completed
[STOP] [2021-06-07 17:39:07] logged process, took 345.61
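The entries above all follow a `[LEVEL] [timestamp] message` layout, and most stages are bracketed by a `[START]` and a `[STOP]` line carrying the same name. As a minimal sketch for reading per-stage timings out of a log like this one, the Python below pairs those lines and reports elapsed seconds. The names `LINE_RE` and `stage_durations` are hypothetical helpers written for this illustration and are not part of the harvester; the sample lines are copied from the `match_nodes` stage of the log above.

```python
import re
from datetime import datetime

# Assumed line layout, inferred from the log above: [LEVEL] [YYYY-MM-DD HH:MM:SS] message
LINE_RE = re.compile(
    r"^\[(START|STOP|INFO|WARN|CMD)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.*)$"
)

def stage_durations(lines):
    """Pair each [START] with the matching [STOP] and return (stage, seconds) tuples.

    Results are ordered by when each stage stopped, so nested stages
    appear before the stage that encloses them.
    """
    open_stages = {}   # stage name -> start time
    durations = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip headers and anything that is not a log entry
        level, stamp, message = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if level == "START":
            open_stages[message] = when
        elif level == "STOP" and message in open_stages:
            started = open_stages.pop(message)
            durations.append((message, (when - started).total_seconds()))
        # A [STOP] with no recorded [START] is ignored.
    return durations

sample = [
    "[START] [2021-06-07 17:34:04] match_nodes",
    "[START] [2021-06-07 17:34:04] map_all_nodes_to_pages",
    "[STOP] [2021-06-07 17:37:00] map_all_nodes_to_pages",
    "[START] [2021-06-07 17:37:00] update_nodes",
    "[STOP] [2021-06-07 17:37:01] update_nodes",
    "[STOP] [2021-06-07 17:37:01] match_nodes",
]

for name, seconds in stage_durations(sample):
    print(f"{name}: {seconds:.0f}s")
```

Run against the full log, a helper along these lines would show that `map_all_nodes_to_pages` and `overall_tsv_creation` account for most of the elapsed time. Timestamps have one-second resolution, so very short stages report as 0 seconds.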
Latest Process