Harvest for Ecuador Species List Created 12 Oct 15:14

Stage: completed
Fetched: 12 Oct 15:14
Validated: 12 Oct 15:14
Deltas Created: 12 Oct 15:14
Units Normalized: 12 Oct 16:16
Ancestry Built: 12 Oct 15:25
Nodes Matched: 12 Oct 16:14
Names Parsed: 12 Oct 15:25
New Models Stored: 12 Oct 15:20
Indexed: 12 Oct 16:16
Completed: 12 Oct 16:29
Time to Harvest: 75 minutes

Harvesting Log

(171 lines)
# Logfile created on 2019-10-12 15:14:33 -0400 by logger.rb/56815
[START] [2019-10-12 15:14:33] logged process
[START] [2019-10-12 15:14:33] create_harvest_instance
[STOP] [2019-10-12 15:14:34] create_harvest_instance
[START] [2019-10-12 15:14:34] fetch_files
[STOP] [2019-10-12 15:14:34] fetch_files
[START] [2019-10-12 15:14:34] validate_each_file
[STOP] [2019-10-12 15:14:40] validate_each_file
[START] [2019-10-12 15:14:40] convert_to_csv
[CMD] [2019-10-12 15:14:40] /usr/bin/sort /app/public/converted_csv/ecuador_sp_list_refs_15635.csv > /app/public/converted_csv/ecuador_sp_list_refs_15635.csv_sorted
[CMD] [2019-10-12 15:14:40] /usr/bin/sort /app/public/converted_csv/ecuador_sp_list_nodes_15636.csv > /app/public/converted_csv/ecuador_sp_list_nodes_15636.csv_sorted
[CMD] [2019-10-12 15:14:40] /usr/bin/sort /app/public/converted_csv/ecuador_sp_list_occurrences_15637.csv > /app/public/converted_csv/ecuador_sp_list_occurrences_15637.csv_sorted
[CMD] [2019-10-12 15:14:40] /usr/bin/sort /app/public/converted_csv/ecuador_sp_list_measurements_15638.csv > /app/public/converted_csv/ecuador_sp_list_measurements_15638.csv_sorted
[STOP] [2019-10-12 15:14:40] convert_to_csv
[START] [2019-10-12 15:14:40] calculate_delta
[CMD] [2019-10-12 15:14:40] echo "0a" > /app/public/diff/ecuador_sp_list_refs_15635.diff
[CMD] [2019-10-12 15:14:41] tail -n +1 /app/public/converted_csv/ecuador_sp_list_refs_15635.csv >> /app/public/diff/ecuador_sp_list_refs_15635.diff
[CMD] [2019-10-12 15:14:41] echo "." >> /app/public/diff/ecuador_sp_list_refs_15635.diff
[CMD] [2019-10-12 15:14:41] echo "0a" > /app/public/diff/ecuador_sp_list_nodes_15636.diff
[CMD] [2019-10-12 15:14:41] tail -n +1 /app/public/converted_csv/ecuador_sp_list_nodes_15636.csv >> /app/public/diff/ecuador_sp_list_nodes_15636.diff
[CMD] [2019-10-12 15:14:41] echo "." >> /app/public/diff/ecuador_sp_list_nodes_15636.diff
[CMD] [2019-10-12 15:14:41] echo "0a" > /app/public/diff/ecuador_sp_list_occurrences_15637.diff
[CMD] [2019-10-12 15:14:41] tail -n +1 /app/public/converted_csv/ecuador_sp_list_occurrences_15637.csv >> /app/public/diff/ecuador_sp_list_occurrences_15637.diff
[CMD] [2019-10-12 15:14:41] echo "." >> /app/public/diff/ecuador_sp_list_occurrences_15637.diff
[CMD] [2019-10-12 15:14:41] echo "0a" > /app/public/diff/ecuador_sp_list_measurements_15638.diff
[CMD] [2019-10-12 15:14:41] tail -n +1 /app/public/converted_csv/ecuador_sp_list_measurements_15638.csv >> /app/public/diff/ecuador_sp_list_measurements_15638.diff
[CMD] [2019-10-12 15:14:41] echo "." >> /app/public/diff/ecuador_sp_list_measurements_15638.diff
[STOP] [2019-10-12 15:14:42] calculate_delta
[START] [2019-10-12 15:14:42] parse_diff_and_store
[INFO] [2019-10-12 15:14:42] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-12 15:14:42] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-12 15:14:59] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-12 15:15:04] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-12 15:18:35] Storing 2 References
[INFO] [2019-10-12 15:18:35] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-12 15:18:35] Average Time: 0.0
[INFO] [2019-10-12 15:18:35] Total Time: 1s
[INFO] [2019-10-12 15:18:35] Storing 45714 ScientificNames
[INFO] [2019-10-12 15:18:35] Processing group of 45714 in 46 groups of 1000
[INFO] [2019-10-12 15:18:53] Average Time: 0.397
[INFO] [2019-10-12 15:18:53] Total Time: 19s
[INFO] [2019-10-12 15:18:53] last 3 / first 3: 0.93
[INFO] [2019-10-12 15:18:53] Std.Dev: 0.12649110640673517; Max: 0.93
[INFO] [2019-10-12 15:18:53] Storing 45714 Nodes
[INFO] [2019-10-12 15:18:53] Processing group of 45714 in 46 groups of 1000
[INFO] [2019-10-12 15:19:07] Average Time: 0.308
[INFO] [2019-10-12 15:19:07] Total Time: 15s
[INFO] [2019-10-12 15:19:07] last 3 / first 3: 0.97
[INFO] [2019-10-12 15:19:07] Std.Dev: 0.03162277660168379; Max: 0.42
[INFO] [2019-10-12 15:19:07] Storing 35942 Occurrences
[INFO] [2019-10-12 15:19:07] Processing group of 35942 in 36 groups of 1000
[INFO] [2019-10-12 15:19:14] Average Time: 0.169
[INFO] [2019-10-12 15:19:14] Total Time: 7s
[INFO] [2019-10-12 15:19:14] last 3 / first 3: 1.16
[INFO] [2019-10-12 15:19:14] Std.Dev: 0.2345207879911715; Max: 1.52
[INFO] [2019-10-12 15:19:14] Storing 71872 TraitsReferences
[INFO] [2019-10-12 15:19:14] Processing group of 71872 in 72 groups of 1000
[INFO] [2019-10-12 15:19:21] Average Time: 0.097
[INFO] [2019-10-12 15:19:21] Total Time: 8s
[INFO] [2019-10-12 15:19:21] last 3 / first 3: 0.68
[INFO] [2019-10-12 15:19:21] Std.Dev: 0.23874672772626646; Max: 2.09
[INFO] [2019-10-12 15:19:21] Storing 71871 Traits
[INFO] [2019-10-12 15:19:21] Processing group of 71871 in 72 groups of 1000
[INFO] [2019-10-12 15:19:48] Average Time: 0.374
[INFO] [2019-10-12 15:19:48] Total Time: 28s
[INFO] [2019-10-12 15:19:48] last 3 / first 3: 0.93
[INFO] [2019-10-12 15:19:48] Std.Dev: 0.32710854467592254; Max: 2.53
[INFO] [2019-10-12 15:19:48] Storing 71831 MetaTraits
[INFO] [2019-10-12 15:19:48] Processing group of 71831 in 72 groups of 1000
[INFO] [2019-10-12 15:20:03] Average Time: 0.195
[INFO] [2019-10-12 15:20:03] Total Time: 15s
[INFO] [2019-10-12 15:20:03] last 3 / first 3: 0.77
[INFO] [2019-10-12 15:20:03] Std.Dev: 0.3346640106136302; Max: 2.41
[STOP] [2019-10-12 15:20:03] parse_diff_and_store
[START] [2019-10-12 15:20:03] resolve_keys
[INFO] [2019-10-12 15:22:11] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-12 15:22:19] traits to occurrences...
[INFO] [2019-10-12 15:22:25] traits to nodes (through occurrences)...
[INFO] [2019-10-12 15:22:26] Traits to sex term...
[INFO] [2019-10-12 15:22:33] Traits to lifestage term...
[INFO] [2019-10-12 15:22:39] MetaTraits to traits...
[INFO] [2019-10-12 15:22:44] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-12 15:22:54] Assocs to occurrences...
[INFO] [2019-10-12 15:22:54] Assocs to nodes...
[INFO] [2019-10-12 15:22:54] Assoc to sex term...
[INFO] [2019-10-12 15:22:54] Assoc to lifestage term...
[STOP] [2019-10-12 15:22:54] resolve_keys
[START] [2019-10-12 15:22:54] hold_for_later_1
[STOP] [2019-10-12 15:22:54] hold_for_later_1
[START] [2019-10-12 15:22:54] hold_for_later_2
[STOP] [2019-10-12 15:22:54] hold_for_later_2
[START] [2019-10-12 15:22:54] resolve_missing_parents
[STOP] [2019-10-12 15:24:08] resolve_missing_parents
[START] [2019-10-12 15:24:08] rebuild_nodes
[START] [2019-10-12 15:24:08] Flattener#flatten
[START] [2019-10-12 15:24:08] Flattener#study_resource
[START] [2019-10-12 15:24:08] Flattener#build_ancestry
[STOP] [2019-10-12 15:24:22] Flattener#build_ancestry
[INFO] [2019-10-12 15:24:22] 45714 ancestry keys
[START] [2019-10-12 15:24:22] build_node_ancestors
[INFO] [2019-10-12 15:24:22] old ancestors deleted.
[STOP] [2019-10-12 15:24:59] build_node_ancestors
[START] [2019-10-12 15:25:04] Flattener#propagate_ancestor_ids
[STOP] [2019-10-12 15:25:10] Flattener#propagate_ancestor_ids
[STOP] [2019-10-12 15:25:10] Flattener#flatten
[STOP] [2019-10-12 15:25:10] rebuild_nodes
[START] [2019-10-12 15:25:10] resolve_missing_media_owners
[STOP] [2019-10-12 15:25:10] resolve_missing_media_owners
[START] [2019-10-12 15:25:10] sanitize_media_verbatims
[STOP] [2019-10-12 15:25:10] sanitize_media_verbatims
[START] [2019-10-12 15:25:10] queue_downloads
[STOP] [2019-10-12 15:25:10] queue_downloads
[START] [2019-10-12 15:25:10] parse_names
[WARN] [2019-10-12 15:25:11] I see 45714 names which still need to be parsed.
[STOP] [2019-10-12 15:25:46] parse_names
[START] [2019-10-12 15:25:46] denormalize_canonical_names_to_nodes
[STOP] [2019-10-12 15:25:47] denormalize_canonical_names_to_nodes
[START] [2019-10-12 15:25:47] match_nodes
[START] [2019-10-12 15:25:47] map_all_nodes_to_pages
[STOP] [2019-10-12 16:14:26] map_all_nodes_to_pages
[INFO] [2019-10-12 16:14:26] 2598 Unmatched nodes (of 45714)! That's too many to output. First 10: Aspidochirotida (#49474925); Pelagothuridae (#49498739); Deima (#49515394); Deima pacificum (#49515393); Cucumaria pacificum (#49519239); Dactylochirotida (#49508758); Pentaceraster occidentalis (#49491826); Cryptopeltaster (#49513263); Poraniopsis mirus (#49506995); Poraniopsis inflatus (#49512872)
[START] [2019-10-12 16:14:26] update_nodes
[STOP] [2019-10-12 16:14:42] update_nodes
[STOP] [2019-10-12 16:14:42] match_nodes
[START] [2019-10-12 16:14:42] reindex_search
[STOP] [2019-10-12 16:16:35] reindex_search
[START] [2019-10-12 16:16:35] normalize_units
[STOP] [2019-10-12 16:16:35] normalize_units
[START] [2019-10-12 16:16:35] calculate_statistics
[STOP] [2019-10-12 16:16:35] calculate_statistics
[START] [2019-10-12 16:16:35] complete_harvest_instance
[START] [2019-10-12 16:16:35] overall_tsv_creation
[INFO] [2019-10-12 16:16:35] Processing group of 45714 in 5 batches of 10000
[INFO] [2019-10-12 16:18:06] 6783 Traits (unfiltered)...
[INFO] [2019-10-12 16:18:20] 6783 Traits (filtered)...
[INFO] [2019-10-12 16:18:20] 0 Associations (filtered)...
[INFO] [2019-10-12 16:19:10] 33914 metadata added.
[INFO] [2019-10-12 16:19:10] 0 metadata added.
[INFO] [2019-10-12 16:20:46] 7942 Traits (unfiltered)...
[INFO] [2019-10-12 16:21:00] 7942 Traits (filtered)...
[INFO] [2019-10-12 16:21:00] 0 Associations (filtered)...
[INFO] [2019-10-12 16:22:00] 39703 metadata added.
[INFO] [2019-10-12 16:22:00] 0 metadata added.
[INFO] [2019-10-12 16:23:35] 8159 Traits (unfiltered)...
[INFO] [2019-10-12 16:23:49] 8159 Traits (filtered)...
[INFO] [2019-10-12 16:23:49] 0 Associations (filtered)...
[INFO] [2019-10-12 16:24:46] 40785 metadata added.
[INFO] [2019-10-12 16:24:46] 0 metadata added.
[INFO] [2019-10-12 16:26:25] 8065 Traits (unfiltered)...
[INFO] [2019-10-12 16:26:39] 8065 Traits (filtered)...
[INFO] [2019-10-12 16:26:39] 0 Associations (filtered)...
[INFO] [2019-10-12 16:27:35] 40312 metadata added.
[INFO] [2019-10-12 16:27:35] 0 metadata added.
[INFO] [2019-10-12 16:28:50] 4987 Traits (unfiltered)...
[INFO] [2019-10-12 16:29:03] 4987 Traits (filtered)...
[INFO] [2019-10-12 16:29:03] 0 Associations (filtered)...
[INFO] [2019-10-12 16:29:53] 24924 metadata added.
[INFO] [2019-10-12 16:29:53] 0 metadata added.
[INFO] [2019-10-12 16:29:53] Average Time: 132.338
[INFO] [2019-10-12 16:29:53] Total Time: 13m18s
[STOP] [2019-10-12 16:29:53] overall_tsv_creation
[INFO] [2019-10-12 16:29:53] Done. Check your files:
[INFO] [2019-10-12 16:29:53] (45714 lines) /app/public/data/ecuador_sp_list/publish_nodes.tsv
[INFO] [2019-10-12 16:29:53] (260180 lines) /app/public/data/ecuador_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-12 16:29:53] (45714 lines) /app/public/data/ecuador_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-12 16:29:53] (35937 lines) /app/public/data/ecuador_sp_list/publish_traits.tsv
[INFO] [2019-10-12 16:29:53] (179639 lines) /app/public/data/ecuador_sp_list/publish_metadata.tsv
[STOP] [2019-10-12 16:29:54] complete_harvest_instance
[START] [2019-10-12 16:29:54] completed
[STOP] [2019-10-12 16:29:54] completed
[STOP] [2019-10-12 16:29:54] logged process, took 4520.39
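
The [START]/[STOP] pairs in the log above can be turned into per-stage durations to see where the time went. The following is a minimal Python sketch, assuming only the line layout visible in this log ("[LEVEL] [timestamp] name"). The names LINE_RE and stage_durations are hypothetical helpers for illustration and are not part of the harvester.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-12 15:14:34] fetch_files".
# Layout assumed from the log above; other levels (INFO, CMD, WARN) are skipped.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} for every START/STOP pair found."""
    open_stages = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        kind, stamp, name = match.groups()
        # A STOP line may carry a suffix such as ", took 4520.39"; drop it.
        name = name.split(",")[0].strip()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_stages[name] = when
        elif name in open_stages:
            started = open_stages.pop(name)
            durations[name] = (when - started).total_seconds()
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2019-10-12 15:25:47] map_all_nodes_to_pages",
    "[STOP] [2019-10-12 16:14:26] map_all_nodes_to_pages",
    "[START] [2019-10-12 16:14:42] reindex_search",
    "[STOP] [2019-10-12 16:16:35] reindex_search",
]

result = stage_durations(sample)
# Print the slowest stage first.
for name, secs in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {secs:.0f}s")
```

On these sample lines, map_all_nodes_to_pages comes to 2919 seconds, which is well over half of the 4520.39 seconds reported for the whole logged process.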
