Harvest for Mozambique Species List Created 09 Jun 10:12

Stage: completed
Fetched: 09 Jun 10:12
Validated: 09 Jun 10:12
Deltas Created: 09 Jun 10:12
Units Normalized: 09 Jun 10:43
Ancestry Built: 09 Jun 10:16
Nodes Matched: 09 Jun 10:42
Names Parsed: 09 Jun 10:16
New Models Stored: 09 Jun 10:14
Indexed: 09 Jun 10:43
Completed: 09 Jun 10:56
Time to Harvest: 1 minute

Harvesting Log

(199 lines)
[INFO] [2021-06-09 10:12:29] Created harvest instance #4004
[STOP] [2021-06-09 10:12:29] create_harvest_instance
[START] [2021-06-09 10:12:29] fetch_files
[STOP] [2021-06-09 10:12:29] fetch_files
[START] [2021-06-09 10:12:29] validate_each_file
[INFO] [2021-06-09 10:12:29] Looping over 4 formats...
[INFO] [2021-06-09 10:12:29] ...refs (/app/public/data/mozambique_sp_li/reference.tab)
[INFO] [2021-06-09 10:12:29] Valid: /app/public/converted_csv/mozambique_sp_li_refs_4004.csv (587 lines)
[INFO] [2021-06-09 10:12:29] ...nodes (/app/public/data/mozambique_sp_li/taxon.tab)
[INFO] [2021-06-09 10:12:30] Valid: /app/public/converted_csv/mozambique_sp_li_nodes_4004.csv (23880 lines)
[INFO] [2021-06-09 10:12:30] ...occurrences (/app/public/data/mozambique_sp_li/occurrence_specific.tab)
[INFO] [2021-06-09 10:12:31] Valid: /app/public/converted_csv/mozambique_sp_li_occurrences_4004.csv (23880 lines)
[INFO] [2021-06-09 10:12:31] ...measurements (/app/public/data/mozambique_sp_li/measurement_or_fact_specific.tab)
[INFO] [2021-06-09 10:12:40] Valid: /app/public/converted_csv/mozambique_sp_li_measurements_4004.csv (97305 lines)
[STOP] [2021-06-09 10:12:40] validate_each_file
[START] [2021-06-09 10:12:40] convert_to_csv
[INFO] [2021-06-09 10:12:40] Looping over 4 formats...
[INFO] [2021-06-09 10:12:40] ...refs (/app/public/data/mozambique_sp_li/reference.tab)
[CMD] [2021-06-09 10:12:40] /usr/bin/sort /app/public/converted_csv/mozambique_sp_li_refs_4004.csv > /app/public/converted_csv/mozambique_sp_li_refs_4004.csv_sorted
[INFO] [2021-06-09 10:12:40] Converted: /app/public/converted_csv/mozambique_sp_li_refs_4004.csv (587 lines)
[INFO] [2021-06-09 10:12:40] ...nodes (/app/public/data/mozambique_sp_li/taxon.tab)
[CMD] [2021-06-09 10:12:40] /usr/bin/sort /app/public/converted_csv/mozambique_sp_li_nodes_4004.csv > /app/public/converted_csv/mozambique_sp_li_nodes_4004.csv_sorted
[INFO] [2021-06-09 10:12:40] Converted: /app/public/converted_csv/mozambique_sp_li_nodes_4004.csv (23880 lines)
[INFO] [2021-06-09 10:12:40] ...occurrences (/app/public/data/mozambique_sp_li/occurrence_specific.tab)
[CMD] [2021-06-09 10:12:40] /usr/bin/sort /app/public/converted_csv/mozambique_sp_li_occurrences_4004.csv > /app/public/converted_csv/mozambique_sp_li_occurrences_4004.csv_sorted
[INFO] [2021-06-09 10:12:40] Converted: /app/public/converted_csv/mozambique_sp_li_occurrences_4004.csv (23880 lines)
[INFO] [2021-06-09 10:12:40] ...measurements (/app/public/data/mozambique_sp_li/measurement_or_fact_specific.tab)
[CMD] [2021-06-09 10:12:40] /usr/bin/sort /app/public/converted_csv/mozambique_sp_li_measurements_4004.csv > /app/public/converted_csv/mozambique_sp_li_measurements_4004.csv_sorted
[INFO] [2021-06-09 10:12:40] Converted: /app/public/converted_csv/mozambique_sp_li_measurements_4004.csv (97305 lines)
[STOP] [2021-06-09 10:12:40] convert_to_csv
[START] [2021-06-09 10:12:40] calculate_delta
[INFO] [2021-06-09 10:12:40] Looping over 4 formats...
[INFO] [2021-06-09 10:12:40] ...refs (/app/public/data/mozambique_sp_li/reference.tab)
[CMD] [2021-06-09 10:12:40] echo "0a" > /app/public/diff/mozambique_sp_li_refs_4004.diff
[CMD] [2021-06-09 10:12:40] tail -n +1 /app/public/converted_csv/mozambique_sp_li_refs_4004.csv >> /app/public/diff/mozambique_sp_li_refs_4004.diff
[CMD] [2021-06-09 10:12:40] echo "." >> /app/public/diff/mozambique_sp_li_refs_4004.diff
[INFO] [2021-06-09 10:12:40] Created diff: /app/public/diff/mozambique_sp_li_refs_4004.diff (589 lines)
[INFO] [2021-06-09 10:12:40] ...nodes (/app/public/data/mozambique_sp_li/taxon.tab)
[CMD] [2021-06-09 10:12:40] echo "0a" > /app/public/diff/mozambique_sp_li_nodes_4004.diff
[CMD] [2021-06-09 10:12:41] tail -n +1 /app/public/converted_csv/mozambique_sp_li_nodes_4004.csv >> /app/public/diff/mozambique_sp_li_nodes_4004.diff
[CMD] [2021-06-09 10:12:41] echo "." >> /app/public/diff/mozambique_sp_li_nodes_4004.diff
[INFO] [2021-06-09 10:12:41] Created diff: /app/public/diff/mozambique_sp_li_nodes_4004.diff (23882 lines)
[INFO] [2021-06-09 10:12:41] ...occurrences (/app/public/data/mozambique_sp_li/occurrence_specific.tab)
[CMD] [2021-06-09 10:12:41] echo "0a" > /app/public/diff/mozambique_sp_li_occurrences_4004.diff
[CMD] [2021-06-09 10:12:41] tail -n +1 /app/public/converted_csv/mozambique_sp_li_occurrences_4004.csv >> /app/public/diff/mozambique_sp_li_occurrences_4004.diff
[CMD] [2021-06-09 10:12:41] echo "." >> /app/public/diff/mozambique_sp_li_occurrences_4004.diff
[INFO] [2021-06-09 10:12:41] Created diff: /app/public/diff/mozambique_sp_li_occurrences_4004.diff (23882 lines)
[INFO] [2021-06-09 10:12:41] ...measurements (/app/public/data/mozambique_sp_li/measurement_or_fact_specific.tab)
[CMD] [2021-06-09 10:12:41] echo "0a" > /app/public/diff/mozambique_sp_li_measurements_4004.diff
[CMD] [2021-06-09 10:12:41] tail -n +1 /app/public/converted_csv/mozambique_sp_li_measurements_4004.csv >> /app/public/diff/mozambique_sp_li_measurements_4004.diff
[CMD] [2021-06-09 10:12:41] echo "." >> /app/public/diff/mozambique_sp_li_measurements_4004.diff
[INFO] [2021-06-09 10:12:42] Created diff: /app/public/diff/mozambique_sp_li_measurements_4004.diff (97307 lines)
[STOP] [2021-06-09 10:12:42] calculate_delta
[START] [2021-06-09 10:12:42] parse_diff_and_store
[INFO] [2021-06-09 10:12:42] Handling diff: /app/public/diff/mozambique_sp_li_refs_4004.diff (589 lines)
[INFO] [2021-06-09 10:12:42] Loading refs diff file into memory (589 /app/public/diff/mozambique_sp_li_refs_4004.diff lines)...
[INFO] [2021-06-09 10:12:42] Handling diff: /app/public/diff/mozambique_sp_li_nodes_4004.diff (23882 lines)
[INFO] [2021-06-09 10:12:42] Loading nodes diff file into memory (23882 /app/public/diff/mozambique_sp_li_nodes_4004.diff lines)...
[WARN] [2021-06-09 10:12:49] Filtered Scientific Name `Charaxes brutus Cramer, 1779/80` to `Charaxes brutus Cramer, 177980`
[WARN] [2021-06-09 10:12:49] Filtered Scientific Name `Charaxes castor Cramer, 1775/76` to `Charaxes castor Cramer, 177576`
[INFO] [2021-06-09 10:12:52] Handling diff: /app/public/diff/mozambique_sp_li_occurrences_4004.diff (23882 lines)
[INFO] [2021-06-09 10:12:52] Loading occurrences diff file into memory (23882 /app/public/diff/mozambique_sp_li_occurrences_4004.diff lines)...
[INFO] [2021-06-09 10:12:55] Handling diff: /app/public/diff/mozambique_sp_li_measurements_4004.diff (97307 lines)
[INFO] [2021-06-09 10:12:55] Loading measurements diff file into memory (97307 /app/public/diff/mozambique_sp_li_measurements_4004.diff lines)...
[INFO] [2021-06-09 10:13:32] Storing 587 References
[INFO] [2021-06-09 10:13:32] Processing group of 587 in 1 groups of 1000
[INFO] [2021-06-09 10:13:32] Average Time: 0.11
[INFO] [2021-06-09 10:13:32] Total Time: 1s
[INFO] [2021-06-09 10:13:32] Storing 33918 ScientificNames
[INFO] [2021-06-09 10:13:32] Processing group of 33918 in 34 groups of 1000
[INFO] [2021-06-09 10:13:42] Average Time: 0.3
[INFO] [2021-06-09 10:13:42] Total Time: 11s
[INFO] [2021-06-09 10:13:42] last 3 / first 3: 0.84
[INFO] [2021-06-09 10:13:42] Std.Dev: 0.03162277660168379; Max: 0.38
[INFO] [2021-06-09 10:13:42] Storing 33918 Nodes
[INFO] [2021-06-09 10:13:42] Processing group of 33918 in 34 groups of 1000
[INFO] [2021-06-09 10:13:55] Average Time: 0.365
[INFO] [2021-06-09 10:13:55] Total Time: 13s
[INFO] [2021-06-09 10:13:55] last 3 / first 3: 0.62
[INFO] [2021-06-09 10:13:55] Std.Dev: 0.08366600265340755; Max: 0.73
[INFO] [2021-06-09 10:13:55] Storing 23880 Occurrences
[INFO] [2021-06-09 10:13:55] Processing group of 23880 in 24 groups of 1000
[INFO] [2021-06-09 10:13:58] Average Time: 0.12
[INFO] [2021-06-09 10:13:58] Total Time: 3s
[INFO] [2021-06-09 10:13:58] last 3 / first 3: 0.76
[INFO] [2021-06-09 10:13:58] Std.Dev: 0.03162277660168379; Max: 0.23
[INFO] [2021-06-09 10:13:58] Storing 49545 TraitsReferences
[INFO] [2021-06-09 10:13:58] Processing group of 49545 in 50 groups of 1000
[INFO] [2021-06-09 10:14:03] Average Time: 0.102
[INFO] [2021-06-09 10:14:03] Total Time: 6s
[INFO] [2021-06-09 10:14:03] last 3 / first 3: 0.52
[INFO] [2021-06-09 10:14:03] Std.Dev: 0.06324555320336758; Max: 0.23
[INFO] [2021-06-09 10:14:03] Storing 97305 Traits
[INFO] [2021-06-09 10:14:03] Processing group of 97305 in 98 groups of 1000
[INFO] [2021-06-09 10:14:36] Average Time: 0.33
[INFO] [2021-06-09 10:14:36] Total Time: 33s
[INFO] [2021-06-09 10:14:36] last 3 / first 3: 0.72
[INFO] [2021-06-09 10:14:36] Std.Dev: 0.10954451150103323; Max: 0.92
[INFO] [2021-06-09 10:14:36] Storing 23880 MetaTraits
[INFO] [2021-06-09 10:14:36] Processing group of 23880 in 24 groups of 1000
[INFO] [2021-06-09 10:14:39] Average Time: 0.124
[INFO] [2021-06-09 10:14:39] Total Time: 4s
[INFO] [2021-06-09 10:14:39] last 3 / first 3: 0.97
[INFO] [2021-06-09 10:14:39] Std.Dev: 0.03162277660168379; Max: 0.19
[STOP] [2021-06-09 10:14:39] parse_diff_and_store
[START] [2021-06-09 10:14:39] resolve_keys
[INFO] [2021-06-09 10:15:34] Occurrences to nodes (through scientific_names)...
[INFO] [2021-06-09 10:15:36] traits to occurrences...
[INFO] [2021-06-09 10:15:38] traits to nodes (through occurrences)...
[INFO] [2021-06-09 10:15:39] Traits to sex term...
[INFO] [2021-06-09 10:15:40] Traits to lifestage term...
[INFO] [2021-06-09 10:15:41] MetaTraits to traits...
[INFO] [2021-06-09 10:15:42] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-06-09 10:15:53] Assocs to occurrences...
[INFO] [2021-06-09 10:15:53] Assocs to nodes...
[INFO] [2021-06-09 10:15:53] Assoc to sex term...
[INFO] [2021-06-09 10:15:53] Assoc to lifestage term...
[INFO] [2021-06-09 10:15:53] MetaAssoc to assocs...
[STOP] [2021-06-09 10:15:53] resolve_keys
[START] [2021-06-09 10:15:53] hold_for_later_1
[STOP] [2021-06-09 10:15:53] hold_for_later_1
[START] [2021-06-09 10:15:53] hold_for_later_2
[STOP] [2021-06-09 10:15:53] hold_for_later_2
[START] [2021-06-09 10:15:53] resolve_missing_parents
[STOP] [2021-06-09 10:15:54] resolve_missing_parents
[START] [2021-06-09 10:15:54] rebuild_nodes
[START] [2021-06-09 10:15:54] Flattener#flatten
[START] [2021-06-09 10:15:54] Flattener#study_resource
[START] [2021-06-09 10:15:55] Flattener#build_ancestry
[STOP] [2021-06-09 10:15:58] Flattener#build_ancestry
[INFO] [2021-06-09 10:15:58] 33918 ancestry keys
[START] [2021-06-09 10:15:58] build_node_ancestors
[INFO] [2021-06-09 10:15:58] old ancestors deleted.
[STOP] [2021-06-09 10:16:14] build_node_ancestors
[START] [2021-06-09 10:16:19] Flattener#propagate_ancestor_ids
[STOP] [2021-06-09 10:16:24] Flattener#propagate_ancestor_ids
[STOP] [2021-06-09 10:16:24] Flattener#flatten
[STOP] [2021-06-09 10:16:24] rebuild_nodes
[START] [2021-06-09 10:16:24] resolve_missing_media_owners
[STOP] [2021-06-09 10:16:24] resolve_missing_media_owners
[START] [2021-06-09 10:16:24] sanitize_media_verbatims
[STOP] [2021-06-09 10:16:24] sanitize_media_verbatims
[START] [2021-06-09 10:16:24] queue_downloads
[STOP] [2021-06-09 10:16:24] queue_downloads
[START] [2021-06-09 10:16:24] parse_names
[WARN] [2021-06-09 10:16:24] I see 33918 names which still need to be parsed.
[WARN] [2021-06-09 10:16:47] I see 137 names which still need to be parsed.
[STOP] [2021-06-09 10:16:48] parse_names
[START] [2021-06-09 10:16:48] denormalize_canonical_names_to_nodes
[STOP] [2021-06-09 10:16:48] denormalize_canonical_names_to_nodes
[START] [2021-06-09 10:16:48] match_nodes
[START] [2021-06-09 10:16:49] map_all_nodes_to_pages
[STOP] [2021-06-09 10:42:25] map_all_nodes_to_pages
[INFO] [2021-06-09 10:42:25] 4298 Unmatched nodes (of 33918)! That's too many to output. Full list in /app/public/data/mozambique_sp_li/unmatched_nodes.txt ; First 10: Canonical: BOLD:AAB9167; Node#95831404; ResourceID: 10000055; Canonical: BOLD:AAC9069; Node#95832583; ResourceID: 10395052; Canonical: BOLD:AAE4025; Node#95831639; ResourceID: 10057465; Canonical: BOLD:AAB0201; Node#95831969; ResourceID: 10172524; Canonical: BOLD:AAA9362; Node#95831972; ResourceID: 10172763; Canonical: BOLD:AAD2621; Node#95832375; ResourceID: 10323511; Canonical: BOLD:ACE9749; Node#95832942; ResourceID: 10510842; Canonical: BOLD:AAB3207; Node#95833090; ResourceID: 10568952; Canonical: BOLD:AAE3994; Node#95865071; ResourceID: 9836311; Canonical: BOLD:AAE4046; Node#95865206; ResourceID: 9928966
[START] [2021-06-09 10:42:25] update_nodes
[STOP] [2021-06-09 10:42:39] update_nodes
[STOP] [2021-06-09 10:42:39] match_nodes
[START] [2021-06-09 10:42:39] reindex_search
[STOP] [2021-06-09 10:43:12] reindex_search
[START] [2021-06-09 10:43:12] normalize_units
[STOP] [2021-06-09 10:43:12] normalize_units
[START] [2021-06-09 10:43:12] calculate_statistics
[STOP] [2021-06-09 10:43:14] calculate_statistics
[START] [2021-06-09 10:43:14] complete_harvest_instance
[START] [2021-06-09 10:43:14] overall_tsv_creation
[INFO] [2021-06-09 10:43:14] Processing group of 33918 in 4 batches of 10000
[INFO] [2021-06-09 10:45:52] 6195 Traits (unfiltered)...
[INFO] [2021-06-09 10:47:09] 6195 Traits (filtered)...
[INFO] [2021-06-09 10:47:12] 0 Associations (filtered)...
[INFO] [2021-06-09 10:47:15] 18174 metadata added.
[INFO] [2021-06-09 10:47:15] 0 metadata added.
[INFO] [2021-06-09 10:49:30] 7066 Traits (unfiltered)...
[INFO] [2021-06-09 10:50:54] 7066 Traits (filtered)...
[INFO] [2021-06-09 10:50:57] 0 Associations (filtered)...
[INFO] [2021-06-09 10:51:02] 38260 metadata added.
[INFO] [2021-06-09 10:51:02] 0 metadata added.
[INFO] [2021-06-09 10:52:39] 7546 Traits (unfiltered)...
[INFO] [2021-06-09 10:54:01] 7546 Traits (filtered)...
[INFO] [2021-06-09 10:54:04] 0 Associations (filtered)...
[INFO] [2021-06-09 10:54:10] 32860 metadata added.
[INFO] [2021-06-09 10:54:10] 0 metadata added.
[INFO] [2021-06-09 10:55:28] 3073 Traits (unfiltered)...
[INFO] [2021-06-09 10:56:24] 3073 Traits (filtered)...
[INFO] [2021-06-09 10:56:27] 0 Associations (filtered)...
[INFO] [2021-06-09 10:56:28] 9796 metadata added.
[INFO] [2021-06-09 10:56:28] 0 metadata added.
[INFO] [2021-06-09 10:56:56] Average Time: 150.013
[INFO] [2021-06-09 10:56:56] Total Time: 13m43s
[STOP] [2021-06-09 10:56:56] overall_tsv_creation
[INFO] [2021-06-09 10:56:56] Done. Check your files:
[INFO] [2021-06-09 10:56:56] (33908 lines) /app/public/data/mozambique_sp_li/publish_nodes.tsv
[INFO] [2021-06-09 10:56:56] (183525 lines) /app/public/data/mozambique_sp_li/publish_node_ancestors.tsv
[INFO] [2021-06-09 10:56:56] (33918 lines) /app/public/data/mozambique_sp_li/publish_scientific_names.tsv
[INFO] [2021-06-09 10:56:56] (23881 lines) /app/public/data/mozambique_sp_li/publish_traits.tsv
[INFO] [2021-06-09 10:56:56] (99091 lines) /app/public/data/mozambique_sp_li/publish_metadata.tsv
[STOP] [2021-06-09 10:56:57] complete_harvest_instance
[START] [2021-06-09 10:56:57] completed
[STOP] [2021-06-09 10:56:57] completed
[STOP] [2021-06-09 10:56:57] logged process, took 2667.73
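
To see where the time in a log like the one above goes, the START/STOP pairs can be turned into per-stage durations. The following is a minimal sketch, assuming every entry follows the `[LEVEL] [YYYY-MM-DD HH:MM:SS] message` layout seen in this log; the function name `stage_durations` and the regular expression are illustrative and not part of the harvester itself.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the log above: [LEVEL] [timestamp] message
LINE_RE = re.compile(
    r"^\[(START|STOP|INFO|WARN|CMD)\] "
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.*)$"
)


def stage_durations(lines):
    """Return {stage name: elapsed seconds} for each matched START/STOP pair."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip headers, blank lines, anything not in log format
        level, stamp, message = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if level == "START":
            started[message] = when
        elif level == "STOP" and message in started:
            # STOP lines with no earlier START (e.g. "logged process, took ...")
            # are ignored by the membership check above.
            durations[message] = (when - started.pop(message)).total_seconds()
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2021-06-09 10:16:49] map_all_nodes_to_pages",
    "[STOP] [2021-06-09 10:42:25] map_all_nodes_to_pages",
    "[START] [2021-06-09 10:42:39] reindex_search",
    "[STOP] [2021-06-09 10:43:12] reindex_search",
]
durations = stage_durations(sample)
for stage, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
    print(f"{stage}: {seconds:.0f}s")
```

Run against the full log, a summary of this kind would show that `map_all_nodes_to_pages` and `overall_tsv_creation` account for most of the elapsed time.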

Latest Process