Stage: completed
Fetched: 09 Jun 11:10
Validated: 09 Jun 11:10
Deltas Created: 09 Jun 11:10
Units Normalized: 09 Jun 11:15
Ancestry Built: 09 Jun 11:10
Nodes Matched: 09 Jun 11:15
Names Parsed: 09 Jun 11:10
New Models Stored: 09 Jun 11:10
Indexed: 09 Jun 11:15
Completed: 09 Jun 11:18
Time to Harvest: less than a minute
Harvesting Log (169 lines)
[INFO] [2021-06-09 11:10:01] Created harvest instance #4007
[STOP] [2021-06-09 11:10:01] create_harvest_instance
[START] [2021-06-09 11:10:01] fetch_files
[STOP] [2021-06-09 11:10:01] fetch_files
[START] [2021-06-09 11:10:01] validate_each_file
[INFO] [2021-06-09 11:10:01] Looping over 4 formats...
[INFO] [2021-06-09 11:10:01] ...refs (/app/public/data/aegean_sea_sp_li/tb_references.txt)
[INFO] [2021-06-09 11:10:01] Valid: /app/public/converted_csv/aegean_sea_sp_li_refs_4007.csv (1 lines)
[INFO] [2021-06-09 11:10:01] ...nodes (/app/public/data/aegean_sea_sp_li/tb_taxon.txt)
[INFO] [2021-06-09 11:10:01] Valid: /app/public/converted_csv/aegean_sea_sp_li_nodes_4007.csv (5937 lines)
[INFO] [2021-06-09 11:10:01] ...occurrences (/app/public/data/aegean_sea_sp_li/tb_occurrence.txt)
[INFO] [2021-06-09 11:10:01] Valid: /app/public/converted_csv/aegean_sea_sp_li_occurrences_4007.csv (3104 lines)
[INFO] [2021-06-09 11:10:01] ...measurements (/app/public/data/aegean_sea_sp_li/tb_measurement.txt)
[INFO] [2021-06-09 11:10:02] Valid: /app/public/converted_csv/aegean_sea_sp_li_measurements_4007.csv (6208 lines)
[STOP] [2021-06-09 11:10:02] validate_each_file
[START] [2021-06-09 11:10:02] convert_to_csv
[INFO] [2021-06-09 11:10:02] Looping over 4 formats...
[INFO] [2021-06-09 11:10:02] ...refs (/app/public/data/aegean_sea_sp_li/tb_references.txt)
[CMD] [2021-06-09 11:10:02] /usr/bin/sort /app/public/converted_csv/aegean_sea_sp_li_refs_4007.csv > /app/public/converted_csv/aegean_sea_sp_li_refs_4007.csv_sorted
[INFO] [2021-06-09 11:10:02] Converted: /app/public/converted_csv/aegean_sea_sp_li_refs_4007.csv (1 lines)
[INFO] [2021-06-09 11:10:02] ...nodes (/app/public/data/aegean_sea_sp_li/tb_taxon.txt)
[CMD] [2021-06-09 11:10:02] /usr/bin/sort /app/public/converted_csv/aegean_sea_sp_li_nodes_4007.csv > /app/public/converted_csv/aegean_sea_sp_li_nodes_4007.csv_sorted
[INFO] [2021-06-09 11:10:02] Converted: /app/public/converted_csv/aegean_sea_sp_li_nodes_4007.csv (5937 lines)
[INFO] [2021-06-09 11:10:02] ...occurrences (/app/public/data/aegean_sea_sp_li/tb_occurrence.txt)
[CMD] [2021-06-09 11:10:02] /usr/bin/sort /app/public/converted_csv/aegean_sea_sp_li_occurrences_4007.csv > /app/public/converted_csv/aegean_sea_sp_li_occurrences_4007.csv_sorted
[INFO] [2021-06-09 11:10:02] Converted: /app/public/converted_csv/aegean_sea_sp_li_occurrences_4007.csv (3104 lines)
[INFO] [2021-06-09 11:10:02] ...measurements (/app/public/data/aegean_sea_sp_li/tb_measurement.txt)
[CMD] [2021-06-09 11:10:02] /usr/bin/sort /app/public/converted_csv/aegean_sea_sp_li_measurements_4007.csv > /app/public/converted_csv/aegean_sea_sp_li_measurements_4007.csv_sorted
[INFO] [2021-06-09 11:10:02] Converted: /app/public/converted_csv/aegean_sea_sp_li_measurements_4007.csv (6208 lines)
[STOP] [2021-06-09 11:10:02] convert_to_csv
[START] [2021-06-09 11:10:02] calculate_delta
[INFO] [2021-06-09 11:10:02] Looping over 4 formats...
[INFO] [2021-06-09 11:10:02] ...refs (/app/public/data/aegean_sea_sp_li/tb_references.txt)
[CMD] [2021-06-09 11:10:02] echo "0a" > /app/public/diff/aegean_sea_sp_li_refs_4007.diff
[CMD] [2021-06-09 11:10:02] tail -n +1 /app/public/converted_csv/aegean_sea_sp_li_refs_4007.csv >> /app/public/diff/aegean_sea_sp_li_refs_4007.diff
[CMD] [2021-06-09 11:10:03] echo "." >> /app/public/diff/aegean_sea_sp_li_refs_4007.diff
[INFO] [2021-06-09 11:10:03] Created diff: /app/public/diff/aegean_sea_sp_li_refs_4007.diff (3 lines)
[INFO] [2021-06-09 11:10:03] ...nodes (/app/public/data/aegean_sea_sp_li/tb_taxon.txt)
[CMD] [2021-06-09 11:10:03] echo "0a" > /app/public/diff/aegean_sea_sp_li_nodes_4007.diff
[CMD] [2021-06-09 11:10:03] tail -n +1 /app/public/converted_csv/aegean_sea_sp_li_nodes_4007.csv >> /app/public/diff/aegean_sea_sp_li_nodes_4007.diff
[CMD] [2021-06-09 11:10:03] echo "." >> /app/public/diff/aegean_sea_sp_li_nodes_4007.diff
[INFO] [2021-06-09 11:10:03] Created diff: /app/public/diff/aegean_sea_sp_li_nodes_4007.diff (5939 lines)
[INFO] [2021-06-09 11:10:03] ...occurrences (/app/public/data/aegean_sea_sp_li/tb_occurrence.txt)
[CMD] [2021-06-09 11:10:03] echo "0a" > /app/public/diff/aegean_sea_sp_li_occurrences_4007.diff
[CMD] [2021-06-09 11:10:03] tail -n +1 /app/public/converted_csv/aegean_sea_sp_li_occurrences_4007.csv >> /app/public/diff/aegean_sea_sp_li_occurrences_4007.diff
[CMD] [2021-06-09 11:10:03] echo "." >> /app/public/diff/aegean_sea_sp_li_occurrences_4007.diff
[INFO] [2021-06-09 11:10:03] Created diff: /app/public/diff/aegean_sea_sp_li_occurrences_4007.diff (3106 lines)
[INFO] [2021-06-09 11:10:03] ...measurements (/app/public/data/aegean_sea_sp_li/tb_measurement.txt)
[CMD] [2021-06-09 11:10:03] echo "0a" > /app/public/diff/aegean_sea_sp_li_measurements_4007.diff
[CMD] [2021-06-09 11:10:03] tail -n +1 /app/public/converted_csv/aegean_sea_sp_li_measurements_4007.csv >> /app/public/diff/aegean_sea_sp_li_measurements_4007.diff
[CMD] [2021-06-09 11:10:03] echo "." >> /app/public/diff/aegean_sea_sp_li_measurements_4007.diff
[INFO] [2021-06-09 11:10:04] Created diff: /app/public/diff/aegean_sea_sp_li_measurements_4007.diff (6210 lines)
[STOP] [2021-06-09 11:10:04] calculate_delta
[START] [2021-06-09 11:10:04] parse_diff_and_store
[INFO] [2021-06-09 11:10:04] Handling diff: /app/public/diff/aegean_sea_sp_li_refs_4007.diff (3 lines)
[INFO] [2021-06-09 11:10:04] Loading refs diff file into memory (3 /app/public/diff/aegean_sea_sp_li_refs_4007.diff lines)...
[INFO] [2021-06-09 11:10:04] Handling diff: /app/public/diff/aegean_sea_sp_li_nodes_4007.diff (5939 lines)
[INFO] [2021-06-09 11:10:04] Loading nodes diff file into memory (5939 /app/public/diff/aegean_sea_sp_li_nodes_4007.diff lines)...
[INFO] [2021-06-09 11:10:05] Handling diff: /app/public/diff/aegean_sea_sp_li_occurrences_4007.diff (3106 lines)
[INFO] [2021-06-09 11:10:06] Loading occurrences diff file into memory (3106 /app/public/diff/aegean_sea_sp_li_occurrences_4007.diff lines)...
[INFO] [2021-06-09 11:10:06] Handling diff: /app/public/diff/aegean_sea_sp_li_measurements_4007.diff (6210 lines)
[INFO] [2021-06-09 11:10:06] Loading measurements diff file into memory (6210 /app/public/diff/aegean_sea_sp_li_measurements_4007.diff lines)...
[INFO] [2021-06-09 11:10:08] Storing 1 References
[INFO] [2021-06-09 11:10:08] Processing group of 1 in 1 groups of 1000
[INFO] [2021-06-09 11:10:08] Average Time: 0.0
[INFO] [2021-06-09 11:10:08] Total Time: 1s
[INFO] [2021-06-09 11:10:08] Storing 5937 ScientificNames
[INFO] [2021-06-09 11:10:08] Processing group of 5937 in 6 groups of 1000
[INFO] [2021-06-09 11:10:10] Average Time: 0.312
[INFO] [2021-06-09 11:10:10] Total Time: 2s
[INFO] [2021-06-09 11:10:10] Storing 5937 Nodes
[INFO] [2021-06-09 11:10:10] Processing group of 5937 in 6 groups of 1000
[INFO] [2021-06-09 11:10:12] Average Time: 0.265
[INFO] [2021-06-09 11:10:12] Total Time: 2s
[INFO] [2021-06-09 11:10:12] Storing 3104 Occurrences
[INFO] [2021-06-09 11:10:12] Processing group of 3104 in 4 groups of 1000
[INFO] [2021-06-09 11:10:12] Average Time: 0.078
[INFO] [2021-06-09 11:10:12] Total Time: 1s
[INFO] [2021-06-09 11:10:12] Storing 6208 Traits
[INFO] [2021-06-09 11:10:12] Processing group of 6208 in 7 groups of 1000
[INFO] [2021-06-09 11:10:14] Average Time: 0.316
[INFO] [2021-06-09 11:10:14] Total Time: 3s
[INFO] [2021-06-09 11:10:14] last 3 / first 3: 1.13
[INFO] [2021-06-09 11:10:14] Std.Dev: 0.13784048752090222; Max: 0.48
[INFO] [2021-06-09 11:10:14] Storing 6208 TraitsReferences
[INFO] [2021-06-09 11:10:14] Processing group of 6208 in 7 groups of 1000
[INFO] [2021-06-09 11:10:15] Average Time: 0.053
[INFO] [2021-06-09 11:10:15] Total Time: 1s
[INFO] [2021-06-09 11:10:15] last 3 / first 3: 0.72
[INFO] [2021-06-09 11:10:15] Std.Dev: 0.0; Max: 0.06
[STOP] [2021-06-09 11:10:15] parse_diff_and_store
[START] [2021-06-09 11:10:15] resolve_keys
[INFO] [2021-06-09 11:10:30] Occurrences to nodes (through scientific_names)...
[INFO] [2021-06-09 11:10:31] traits to occurrences...
[INFO] [2021-06-09 11:10:33] traits to nodes (through occurrences)...
[INFO] [2021-06-09 11:10:33] Traits to sex term...
[INFO] [2021-06-09 11:10:34] Traits to lifestage term...
[INFO] [2021-06-09 11:10:36] MetaTraits to traits...
[INFO] [2021-06-09 11:10:36] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-06-09 11:10:36] Assocs to occurrences...
[INFO] [2021-06-09 11:10:36] Assocs to nodes...
[INFO] [2021-06-09 11:10:36] Assoc to sex term...
[INFO] [2021-06-09 11:10:36] Assoc to lifestage term...
[INFO] [2021-06-09 11:10:36] MetaAssoc to assocs...
[STOP] [2021-06-09 11:10:36] resolve_keys
[START] [2021-06-09 11:10:36] hold_for_later_1
[STOP] [2021-06-09 11:10:36] hold_for_later_1
[START] [2021-06-09 11:10:36] hold_for_later_2
[STOP] [2021-06-09 11:10:36] hold_for_later_2
[START] [2021-06-09 11:10:36] resolve_missing_parents
[STOP] [2021-06-09 11:10:38] resolve_missing_parents
[START] [2021-06-09 11:10:38] rebuild_nodes
[START] [2021-06-09 11:10:38] Flattener#flatten
[START] [2021-06-09 11:10:38] Flattener#study_resource
[START] [2021-06-09 11:10:38] Flattener#build_ancestry
[STOP] [2021-06-09 11:10:39] Flattener#build_ancestry
[INFO] [2021-06-09 11:10:39] 5937 ancestry keys
[START] [2021-06-09 11:10:39] build_node_ancestors
[INFO] [2021-06-09 11:10:39] old ancestors deleted.
[STOP] [2021-06-09 11:10:40] build_node_ancestors
[START] [2021-06-09 11:10:42] Flattener#propagate_ancestor_ids
[STOP] [2021-06-09 11:10:42] Flattener#propagate_ancestor_ids
[STOP] [2021-06-09 11:10:42] Flattener#flatten
[STOP] [2021-06-09 11:10:42] rebuild_nodes
[START] [2021-06-09 11:10:42] resolve_missing_media_owners
[STOP] [2021-06-09 11:10:42] resolve_missing_media_owners
[START] [2021-06-09 11:10:42] sanitize_media_verbatims
[STOP] [2021-06-09 11:10:42] sanitize_media_verbatims
[START] [2021-06-09 11:10:42] queue_downloads
[STOP] [2021-06-09 11:10:42] queue_downloads
[START] [2021-06-09 11:10:42] parse_names
[WARN] [2021-06-09 11:10:42] I see 5937 names which still need to be parsed.
[STOP] [2021-06-09 11:10:47] parse_names
[START] [2021-06-09 11:10:47] denormalize_canonical_names_to_nodes
[STOP] [2021-06-09 11:10:47] denormalize_canonical_names_to_nodes
[START] [2021-06-09 11:10:47] match_nodes
[START] [2021-06-09 11:10:47] map_all_nodes_to_pages
[STOP] [2021-06-09 11:15:36] map_all_nodes_to_pages
[INFO] [2021-06-09 11:15:36] 288 Unmatched nodes (of 5937)! That's too many to output. Full list in /app/public/data/aegean_sea_sp_li/unmatched_nodes.txt ; First 10: Canonical: Granuloreticulosea; Node#95872185; ResourceID: T100004; Canonical: Globigerinoides sacculifer; Node#95872335; ResourceID: T100154; Canonical: Globoturborotalita tenella; Node#95872364; ResourceID: T100183; Canonical: Globigerinella aequilateralis; Node#95873034; ResourceID: T100854; Canonical: Globorotalia inflata; Node#95872365; ResourceID: T100184; Canonical: Gyrodinium prunus; Node#95875473; ResourceID: T103301; Canonical: Cochlodinium brandtii; Node#95872418; ResourceID: T100237; Canonical: Cochlodinium brandti; Node#95872467; ResourceID: T100286; Canonical: Cochlodinium pupa; Node#95873222; ResourceID: T101042; Canonical: Gymnodinium conicum; Node#95875318; ResourceID: T103146
[START] [2021-06-09 11:15:36] update_nodes
[STOP] [2021-06-09 11:15:39] update_nodes
[STOP] [2021-06-09 11:15:39] match_nodes
[START] [2021-06-09 11:15:39] reindex_search
[STOP] [2021-06-09 11:15:44] reindex_search
[START] [2021-06-09 11:15:44] normalize_units
[STOP] [2021-06-09 11:15:44] normalize_units
[START] [2021-06-09 11:15:44] calculate_statistics
[STOP] [2021-06-09 11:15:45] calculate_statistics
[START] [2021-06-09 11:15:45] complete_harvest_instance
[START] [2021-06-09 11:15:45] overall_tsv_creation
[INFO] [2021-06-09 11:15:45] Processing group of 5937 in 1 batches of 10000
[INFO] [2021-06-09 11:16:39] 3104 Traits (unfiltered)...
[INFO] [2021-06-09 11:17:33] 3104 Traits (filtered)...
[INFO] [2021-06-09 11:17:35] 0 Associations (filtered)...
[INFO] [2021-06-09 11:17:36] 3104 metadata added.
[INFO] [2021-06-09 11:17:36] 0 metadata added.
[INFO] [2021-06-09 11:18:04] Average Time: 112.97
[INFO] [2021-06-09 11:18:04] Total Time: 2m19s
[STOP] [2021-06-09 11:18:04] overall_tsv_creation
[INFO] [2021-06-09 11:18:04] Done. Check your files:
[INFO] [2021-06-09 11:18:04] (5937 lines) /app/public/data/aegean_sea_sp_li/publish_nodes.tsv
[INFO] [2021-06-09 11:18:04] (30032 lines) /app/public/data/aegean_sea_sp_li/publish_node_ancestors.tsv
[INFO] [2021-06-09 11:18:04] (5937 lines) /app/public/data/aegean_sea_sp_li/publish_scientific_names.tsv
[INFO] [2021-06-09 11:18:04] (3105 lines) /app/public/data/aegean_sea_sp_li/publish_traits.tsv
[INFO] [2021-06-09 11:18:04] (3105 lines) /app/public/data/aegean_sea_sp_li/publish_metadata.tsv
[STOP] [2021-06-09 11:18:04] complete_harvest_instance
[START] [2021-06-09 11:18:04] completed
[STOP] [2021-06-09 11:18:04] completed
[STOP] [2021-06-09 11:18:04] logged process, took 483.39
Latest Process