Harvest for Malacca Strait Species List
Created: 23 Dec 18:27

Stage: completed
Fetched: 23 Dec 18:27
Validated: 23 Dec 18:27
Deltas Created: 23 Dec 18:27
Units Normalized: 23 Dec 18:49
Ancestry Built: 23 Dec 18:28
Nodes Matched: 23 Dec 18:49
Names Parsed: 23 Dec 18:28
New Models Stored: 23 Dec 18:28
Indexed: 23 Dec 18:49
Completed: 23 Dec 18:51
Time to Harvest: about 24 minutes (18:27 to 18:51; the log reports 1461.65, presumably seconds)

Harvesting Log (oldest entry first)

# Logfile created on 2019-12-23 18:27:25 -0500 by logger.rb/56815
[START] [2019-12-23 18:27:25] logged process
[START] [2019-12-23 18:27:25] create_harvest_instance
[STOP] [2019-12-23 18:27:26] create_harvest_instance
[START] [2019-12-23 18:27:26] fetch_files
[STOP] [2019-12-23 18:27:26] fetch_files
[START] [2019-12-23 18:27:26] validate_each_file
[STOP] [2019-12-23 18:27:27] validate_each_file
[START] [2019-12-23 18:27:27] convert_to_csv
[CMD] [2019-12-23 18:27:27] /usr/bin/sort /app/public/converted_csv/malacca_strait_s_refs_19474.csv > /app/public/converted_csv/malacca_strait_s_refs_19474.csv_sorted
[CMD] [2019-12-23 18:27:28] /usr/bin/sort /app/public/converted_csv/malacca_strait_s_nodes_19475.csv > /app/public/converted_csv/malacca_strait_s_nodes_19475.csv_sorted
[CMD] [2019-12-23 18:27:28] /usr/bin/sort /app/public/converted_csv/malacca_strait_s_occurrences_19476.csv > /app/public/converted_csv/malacca_strait_s_occurrences_19476.csv_sorted
[CMD] [2019-12-23 18:27:29] /usr/bin/sort /app/public/converted_csv/malacca_strait_s_measurements_19477.csv > /app/public/converted_csv/malacca_strait_s_measurements_19477.csv_sorted
[STOP] [2019-12-23 18:27:30] convert_to_csv
[START] [2019-12-23 18:27:30] calculate_delta
[CMD] [2019-12-23 18:27:30] echo "0a" > /app/public/diff/malacca_strait_s_refs_19474.diff
[CMD] [2019-12-23 18:27:30] tail -n +1 /app/public/converted_csv/malacca_strait_s_refs_19474.csv >> /app/public/diff/malacca_strait_s_refs_19474.diff
[CMD] [2019-12-23 18:27:31] echo "." >> /app/public/diff/malacca_strait_s_refs_19474.diff
[CMD] [2019-12-23 18:27:32] echo "0a" > /app/public/diff/malacca_strait_s_nodes_19475.diff
[CMD] [2019-12-23 18:27:32] tail -n +1 /app/public/converted_csv/malacca_strait_s_nodes_19475.csv >> /app/public/diff/malacca_strait_s_nodes_19475.diff
[CMD] [2019-12-23 18:27:33] echo "." >> /app/public/diff/malacca_strait_s_nodes_19475.diff
[CMD] [2019-12-23 18:27:33] echo "0a" > /app/public/diff/malacca_strait_s_occurrences_19476.diff
[CMD] [2019-12-23 18:27:34] tail -n +1 /app/public/converted_csv/malacca_strait_s_occurrences_19476.csv >> /app/public/diff/malacca_strait_s_occurrences_19476.diff
[CMD] [2019-12-23 18:27:35] echo "." >> /app/public/diff/malacca_strait_s_occurrences_19476.diff
[CMD] [2019-12-23 18:27:35] echo "0a" > /app/public/diff/malacca_strait_s_measurements_19477.diff
[CMD] [2019-12-23 18:27:36] tail -n +1 /app/public/converted_csv/malacca_strait_s_measurements_19477.csv >> /app/public/diff/malacca_strait_s_measurements_19477.diff
[CMD] [2019-12-23 18:27:37] echo "." >> /app/public/diff/malacca_strait_s_measurements_19477.diff
[STOP] [2019-12-23 18:27:37] calculate_delta
[START] [2019-12-23 18:27:37] parse_diff_and_store
[INFO] [2019-12-23 18:27:38] Loading refs diff file into memory (true lines)...
[INFO] [2019-12-23 18:27:38] Loading nodes diff file into memory (true lines)...
[INFO] [2019-12-23 18:27:41] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-12-23 18:27:42] Loading measurements diff file into memory (true lines)...
[INFO] [2019-12-23 18:27:57] Storing 2 References
[INFO] [2019-12-23 18:27:57] Processing group of 2 in 1 groups of 1000
[INFO] [2019-12-23 18:27:57] Average Time: 0.0
[INFO] [2019-12-23 18:27:57] Total Time: 1s
[INFO] [2019-12-23 18:27:57] Storing 4935 ScientificNames
[INFO] [2019-12-23 18:27:57] Processing group of 4935 in 5 groups of 1000
[INFO] [2019-12-23 18:27:59] Average Time: 0.368
[INFO] [2019-12-23 18:27:59] Total Time: 2s
[INFO] [2019-12-23 18:27:59] Storing 4935 Nodes
[INFO] [2019-12-23 18:27:59] Processing group of 4935 in 5 groups of 1000
[INFO] [2019-12-23 18:28:00] Average Time: 0.308
[INFO] [2019-12-23 18:28:00] Total Time: 2s
[INFO] [2019-12-23 18:28:00] Storing 2709 Occurrences
[INFO] [2019-12-23 18:28:00] Processing group of 2709 in 3 groups of 1000
[INFO] [2019-12-23 18:28:00] Average Time: 0.12
[INFO] [2019-12-23 18:28:00] Total Time: 1s
[INFO] [2019-12-23 18:28:00] Storing 5418 TraitsReferences
[INFO] [2019-12-23 18:28:00] Processing group of 5418 in 6 groups of 1000
[INFO] [2019-12-23 18:28:01] Average Time: 0.082
[INFO] [2019-12-23 18:28:01] Total Time: 1s
[INFO] [2019-12-23 18:28:01] Storing 5418 Traits
[INFO] [2019-12-23 18:28:01] Processing group of 5418 in 6 groups of 1000
[INFO] [2019-12-23 18:28:03] Average Time: 0.305
[INFO] [2019-12-23 18:28:03] Total Time: 2s
[INFO] [2019-12-23 18:28:03] Storing 5418 MetaTraits
[INFO] [2019-12-23 18:28:03] Processing group of 5418 in 6 groups of 1000
[INFO] [2019-12-23 18:28:04] Average Time: 0.143
[INFO] [2019-12-23 18:28:04] Total Time: 1s
[STOP] [2019-12-23 18:28:04] parse_diff_and_store
[START] [2019-12-23 18:28:04] resolve_keys
[INFO] [2019-12-23 18:28:31] Occurrences to nodes (through scientific_names)...
[INFO] [2019-12-23 18:28:33] traits to occurrences...
[INFO] [2019-12-23 18:28:34] traits to nodes (through occurrences)...
[INFO] [2019-12-23 18:28:34] Traits to sex term...
[INFO] [2019-12-23 18:28:35] Traits to lifestage term...
[INFO] [2019-12-23 18:28:36] MetaTraits to traits...
[INFO] [2019-12-23 18:28:37] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-12-23 18:28:38] Assocs to occurrences...
[INFO] [2019-12-23 18:28:38] Assocs to nodes...
[INFO] [2019-12-23 18:28:38] Assoc to sex term...
[INFO] [2019-12-23 18:28:38] Assoc to lifestage term...
[STOP] [2019-12-23 18:28:38] resolve_keys
[START] [2019-12-23 18:28:38] hold_for_later_1
[STOP] [2019-12-23 18:28:38] hold_for_later_1
[START] [2019-12-23 18:28:38] hold_for_later_2
[STOP] [2019-12-23 18:28:38] hold_for_later_2
[START] [2019-12-23 18:28:38] resolve_missing_parents
[STOP] [2019-12-23 18:28:47] resolve_missing_parents
[START] [2019-12-23 18:28:47] rebuild_nodes
[START] [2019-12-23 18:28:47] Flattener#flatten
[START] [2019-12-23 18:28:47] Flattener#study_resource
[START] [2019-12-23 18:28:47] Flattener#build_ancestry
[STOP] [2019-12-23 18:28:48] Flattener#build_ancestry
[INFO] [2019-12-23 18:28:48] 4935 ancestry keys
[START] [2019-12-23 18:28:48] build_node_ancestors
[INFO] [2019-12-23 18:28:48] old ancestors deleted.
[STOP] [2019-12-23 18:28:50] build_node_ancestors
[START] [2019-12-23 18:28:52] Flattener#propagate_ancestor_ids
[STOP] [2019-12-23 18:28:52] Flattener#propagate_ancestor_ids
[STOP] [2019-12-23 18:28:52] Flattener#flatten
[STOP] [2019-12-23 18:28:52] rebuild_nodes
[START] [2019-12-23 18:28:52] resolve_missing_media_owners
[STOP] [2019-12-23 18:28:52] resolve_missing_media_owners
[START] [2019-12-23 18:28:52] sanitize_media_verbatims
[STOP] [2019-12-23 18:28:52] sanitize_media_verbatims
[START] [2019-12-23 18:28:53] queue_downloads
[STOP] [2019-12-23 18:28:53] queue_downloads
[START] [2019-12-23 18:28:53] parse_names
[WARN] [2019-12-23 18:28:53] I see 4935 names which still need to be parsed.
[STOP] [2019-12-23 18:28:57] parse_names
[START] [2019-12-23 18:28:57] denormalize_canonical_names_to_nodes
[STOP] [2019-12-23 18:28:57] denormalize_canonical_names_to_nodes
[START] [2019-12-23 18:28:57] match_nodes
[START] [2019-12-23 18:28:57] map_all_nodes_to_pages
[STOP] [2019-12-23 18:49:15] map_all_nodes_to_pages
[INFO] [2019-12-23 18:49:15] 156 Unmatched nodes (of 4935)! That's too many to output. First 10: Limicola (#62108443); Limicola falcinellus (#62108442); Limnodromus (#62108928); Egretta intermedia (#62106802); Megalaima (#62106842); Megalaima haemacephala (#62106841); Eudynamys scolopacea (#62111447); Anas querquedula (#62109447); Ostorhinchus molluccensis (#62109510); Apogon striatus (#62111196)
[START] [2019-12-23 18:49:15] update_nodes
[STOP] [2019-12-23 18:49:17] update_nodes
[STOP] [2019-12-23 18:49:17] match_nodes
[START] [2019-12-23 18:49:17] reindex_search
[STOP] [2019-12-23 18:49:33] reindex_search
[START] [2019-12-23 18:49:33] normalize_units
[STOP] [2019-12-23 18:49:33] normalize_units
[START] [2019-12-23 18:49:33] calculate_statistics
[STOP] [2019-12-23 18:49:33] calculate_statistics
[START] [2019-12-23 18:49:33] complete_harvest_instance
[START] [2019-12-23 18:49:33] overall_tsv_creation
[INFO] [2019-12-23 18:49:33] Processing group of 4935 in 1 batches of 10000
[INFO] [2019-12-23 18:50:44] 2709 Traits (unfiltered)...
[INFO] [2019-12-23 18:50:58] 2709 Traits (filtered)...
[INFO] [2019-12-23 18:50:58] 0 Associations (filtered)...
[INFO] [2019-12-23 18:51:43] 13545 metadata added.
[INFO] [2019-12-23 18:51:43] 0 metadata added.
[INFO] [2019-12-23 18:51:43] Average Time: 104.38
[INFO] [2019-12-23 18:51:43] Total Time: 2m10s
[STOP] [2019-12-23 18:51:43] overall_tsv_creation
[INFO] [2019-12-23 18:51:43] Done. Check your files:
[INFO] [2019-12-23 18:51:44] (4935 lines) /app/public/data/malacca_strait_s/publish_nodes.tsv
[INFO] [2019-12-23 18:51:44] (25799 lines) /app/public/data/malacca_strait_s/publish_node_ancestors.tsv
[INFO] [2019-12-23 18:51:45] (4935 lines) /app/public/data/malacca_strait_s/publish_scientific_names.tsv
[INFO] [2019-12-23 18:51:45] (2710 lines) /app/public/data/malacca_strait_s/publish_traits.tsv
[INFO] [2019-12-23 18:51:46] (13546 lines) /app/public/data/malacca_strait_s/publish_metadata.tsv
[STOP] [2019-12-23 18:51:46] complete_harvest_instance
[START] [2019-12-23 18:51:46] completed
[STOP] [2019-12-23 18:51:46] completed
[STOP] [2019-12-23 18:51:46] logged process, took 1461.65
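
To see where the time went, the [START]/[STOP] pairs in the log above can be paired up and subtracted. The following is a minimal sketch, not part of the harvester itself: the function name `step_durations` and the regular expression are assumptions made for illustration, and it only assumes the line layout visible in this log (`[KIND] [timestamp] step name`, with a `, took N` suffix on the final STOP line).

```python
import re
from datetime import datetime

# Matches only [START]/[STOP] lines; [INFO], [CMD], [WARN] and the header are ignored.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def step_durations(lines):
    """Return (step name, seconds) pairs in the order the steps stopped."""
    open_steps = {}  # step name -> start time
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps[name] = when
        else:
            # The last STOP line reads "logged process, took 1461.65";
            # drop that suffix so it pairs with its START line.
            name = name.split(", took")[0]
            start = open_steps.pop(name, None)
            if start is not None:
                durations.append((name, (when - start).total_seconds()))
    return durations

# Lines copied from the match_nodes portion of the log above.
sample = [
    "[START] [2019-12-23 18:28:57] match_nodes",
    "[START] [2019-12-23 18:28:57] map_all_nodes_to_pages",
    "[STOP] [2019-12-23 18:49:15] map_all_nodes_to_pages",
    "[START] [2019-12-23 18:49:15] update_nodes",
    "[STOP] [2019-12-23 18:49:17] update_nodes",
    "[STOP] [2019-12-23 18:49:17] match_nodes",
]

# Print the slowest steps first.
for name, seconds in sorted(step_durations(sample), key=lambda pair: -pair[1]):
    print(f"{name}: {seconds:.0f}s")
```

Run over the sample lines, this reports map_all_nodes_to_pages at 1218 seconds, which accounts for most of the 1461.65 reported for the whole process. Timestamps in the log have one-second resolution, so short steps show up as 0 or 1 second.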