Harvest for Japan Species List Created 31 Oct 20:09

Stage: completed
Fetched: 31 Oct 20:09
Validated: 31 Oct 20:09
Deltas Created: 31 Oct 20:09
Units Normalized: 31 Oct 22:03
Ancestry Built: 31 Oct 20:19
Nodes Matched: 31 Oct 22:01
Names Parsed: 31 Oct 20:20
New Models Stored: 31 Oct 20:14
Indexed: 31 Oct 22:03
Completed: 31 Oct 22:17
Time to Harvest: 2 hours 8 minutes (7656.64 s per the log)

Harvesting Log

(172 lines)
# Logfile created on 2019-10-31 20:09:25 -0400 by logger.rb/56815
[START] [2019-10-31 20:09:25] logged process
[START] [2019-10-31 20:09:25] create_harvest_instance
[STOP] [2019-10-31 20:09:25] create_harvest_instance
[START] [2019-10-31 20:09:25] fetch_files
[STOP] [2019-10-31 20:09:25] fetch_files
[START] [2019-10-31 20:09:25] validate_each_file
[STOP] [2019-10-31 20:09:31] validate_each_file
[START] [2019-10-31 20:09:31] convert_to_csv
[CMD] [2019-10-31 20:09:31] /usr/bin/sort /app/public/converted_csv/japan_sp_list_refs_17968.csv > /app/public/converted_csv/japan_sp_list_refs_17968.csv_sorted
[CMD] [2019-10-31 20:09:32] /usr/bin/sort /app/public/converted_csv/japan_sp_list_nodes_17969.csv > /app/public/converted_csv/japan_sp_list_nodes_17969.csv_sorted
[CMD] [2019-10-31 20:09:32] /usr/bin/sort /app/public/converted_csv/japan_sp_list_occurrences_17970.csv > /app/public/converted_csv/japan_sp_list_occurrences_17970.csv_sorted
[CMD] [2019-10-31 20:09:32] /usr/bin/sort /app/public/converted_csv/japan_sp_list_measurements_17971.csv > /app/public/converted_csv/japan_sp_list_measurements_17971.csv_sorted
[STOP] [2019-10-31 20:09:32] convert_to_csv
[START] [2019-10-31 20:09:32] calculate_delta
[CMD] [2019-10-31 20:09:32] echo "0a" > /app/public/diff/japan_sp_list_refs_17968.diff
[CMD] [2019-10-31 20:09:33] tail -n +1 /app/public/converted_csv/japan_sp_list_refs_17968.csv >> /app/public/diff/japan_sp_list_refs_17968.diff
[CMD] [2019-10-31 20:09:33] echo "." >> /app/public/diff/japan_sp_list_refs_17968.diff
[CMD] [2019-10-31 20:09:33] echo "0a" > /app/public/diff/japan_sp_list_nodes_17969.diff
[CMD] [2019-10-31 20:09:33] tail -n +1 /app/public/converted_csv/japan_sp_list_nodes_17969.csv >> /app/public/diff/japan_sp_list_nodes_17969.diff
[CMD] [2019-10-31 20:09:33] echo "." >> /app/public/diff/japan_sp_list_nodes_17969.diff
[CMD] [2019-10-31 20:09:34] echo "0a" > /app/public/diff/japan_sp_list_occurrences_17970.diff
[CMD] [2019-10-31 20:09:34] tail -n +1 /app/public/converted_csv/japan_sp_list_occurrences_17970.csv >> /app/public/diff/japan_sp_list_occurrences_17970.diff
[CMD] [2019-10-31 20:09:34] echo "." >> /app/public/diff/japan_sp_list_occurrences_17970.diff
[CMD] [2019-10-31 20:09:34] echo "0a" > /app/public/diff/japan_sp_list_measurements_17971.diff
[CMD] [2019-10-31 20:09:34] tail -n +1 /app/public/converted_csv/japan_sp_list_measurements_17971.csv >> /app/public/diff/japan_sp_list_measurements_17971.diff
[CMD] [2019-10-31 20:09:35] echo "." >> /app/public/diff/japan_sp_list_measurements_17971.diff
[STOP] [2019-10-31 20:09:35] calculate_delta
[START] [2019-10-31 20:09:35] parse_diff_and_store
[INFO] [2019-10-31 20:09:35] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-31 20:09:35] Loading nodes diff file into memory (true lines)...
[WARN] [2019-10-31 20:09:42] Filtered Scientific Name `Orosanga  japonicus` to `Orosanga japonicus`
[INFO] [2019-10-31 20:09:54] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-31 20:09:59] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-31 20:13:02] Storing 2 References
[INFO] [2019-10-31 20:13:02] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-31 20:13:03] Average Time: 0.0
[INFO] [2019-10-31 20:13:03] Total Time: 1s
[INFO] [2019-10-31 20:13:03] Storing 49819 ScientificNames
[INFO] [2019-10-31 20:13:03] Processing group of 49819 in 50 groups of 1000
[INFO] [2019-10-31 20:13:22] Average Time: 0.391
[INFO] [2019-10-31 20:13:22] Total Time: 20s
[INFO] [2019-10-31 20:13:22] last 3 / first 3: 1.0
[INFO] [2019-10-31 20:13:22] Std.Dev: 0.1341640786499874; Max: 1.07
[INFO] [2019-10-31 20:13:22] Storing 49819 Nodes
[INFO] [2019-10-31 20:13:22] Processing group of 49819 in 50 groups of 1000
[INFO] [2019-10-31 20:13:40] Average Time: 0.342
[INFO] [2019-10-31 20:13:40] Total Time: 18s
[INFO] [2019-10-31 20:13:40] last 3 / first 3: 1.19
[INFO] [2019-10-31 20:13:40] Std.Dev: 0.1341640786499874; Max: 1.04
[INFO] [2019-10-31 20:13:40] Storing 32757 Occurrences
[INFO] [2019-10-31 20:13:40] Processing group of 32757 in 33 groups of 1000
[INFO] [2019-10-31 20:13:43] Average Time: 0.105
[INFO] [2019-10-31 20:13:43] Total Time: 4s
[INFO] [2019-10-31 20:13:43] last 3 / first 3: 1.41
[INFO] [2019-10-31 20:13:43] Std.Dev: 0.0; Max: 0.19
[INFO] [2019-10-31 20:13:43] Storing 65514 TraitsReferences
[INFO] [2019-10-31 20:13:43] Processing group of 65514 in 66 groups of 1000
[INFO] [2019-10-31 20:13:49] Average Time: 0.084
[INFO] [2019-10-31 20:13:49] Total Time: 6s
[INFO] [2019-10-31 20:13:49] last 3 / first 3: 1.04
[INFO] [2019-10-31 20:13:49] Std.Dev: 0.03162277660168379; Max: 0.18
[INFO] [2019-10-31 20:13:49] Storing 65514 Traits
[INFO] [2019-10-31 20:13:49] Processing group of 65514 in 66 groups of 1000
[INFO] [2019-10-31 20:14:14] Average Time: 0.378
[INFO] [2019-10-31 20:14:14] Total Time: 26s
[INFO] [2019-10-31 20:14:14] last 3 / first 3: 0.31
[INFO] [2019-10-31 20:14:14] Std.Dev: 0.27568097504180444; Max: 1.96
[INFO] [2019-10-31 20:14:14] Storing 65387 MetaTraits
[INFO] [2019-10-31 20:14:14] Processing group of 65387 in 66 groups of 1000
[INFO] [2019-10-31 20:14:22] Average Time: 0.119
[INFO] [2019-10-31 20:14:22] Total Time: 9s
[INFO] [2019-10-31 20:14:22] last 3 / first 3: 1.83
[INFO] [2019-10-31 20:14:22] Std.Dev: 0.03162277660168379; Max: 0.29
[STOP] [2019-10-31 20:14:22] parse_diff_and_store
[START] [2019-10-31 20:14:22] resolve_keys
[INFO] [2019-10-31 20:16:27] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-31 20:16:34] traits to occurrences...
[INFO] [2019-10-31 20:16:41] traits to nodes (through occurrences)...
[INFO] [2019-10-31 20:16:42] Traits to sex term...
[INFO] [2019-10-31 20:16:47] Traits to lifestage term...
[INFO] [2019-10-31 20:16:53] MetaTraits to traits...
[INFO] [2019-10-31 20:16:57] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-31 20:17:05] Assocs to occurrences...
[INFO] [2019-10-31 20:17:05] Assocs to nodes...
[INFO] [2019-10-31 20:17:05] Assoc to sex term...
[INFO] [2019-10-31 20:17:05] Assoc to lifestage term...
[STOP] [2019-10-31 20:17:05] resolve_keys
[START] [2019-10-31 20:17:06] hold_for_later_1
[STOP] [2019-10-31 20:17:06] hold_for_later_1
[START] [2019-10-31 20:17:06] hold_for_later_2
[STOP] [2019-10-31 20:17:06] hold_for_later_2
[START] [2019-10-31 20:17:06] resolve_missing_parents
[STOP] [2019-10-31 20:18:14] resolve_missing_parents
[START] [2019-10-31 20:18:14] rebuild_nodes
[START] [2019-10-31 20:18:14] Flattener#flatten
[START] [2019-10-31 20:18:14] Flattener#study_resource
[START] [2019-10-31 20:18:14] Flattener#build_ancestry
[STOP] [2019-10-31 20:18:30] Flattener#build_ancestry
[INFO] [2019-10-31 20:18:30] 49819 ancestry keys
[START] [2019-10-31 20:18:30] build_node_ancestors
[INFO] [2019-10-31 20:18:30] old ancestors deleted.
[STOP] [2019-10-31 20:19:08] build_node_ancestors
[START] [2019-10-31 20:19:13] Flattener#propagate_ancestor_ids
[STOP] [2019-10-31 20:19:21] Flattener#propagate_ancestor_ids
[STOP] [2019-10-31 20:19:21] Flattener#flatten
[STOP] [2019-10-31 20:19:21] rebuild_nodes
[START] [2019-10-31 20:19:21] resolve_missing_media_owners
[STOP] [2019-10-31 20:19:21] resolve_missing_media_owners
[START] [2019-10-31 20:19:21] sanitize_media_verbatims
[STOP] [2019-10-31 20:19:21] sanitize_media_verbatims
[START] [2019-10-31 20:19:21] queue_downloads
[STOP] [2019-10-31 20:19:21] queue_downloads
[START] [2019-10-31 20:19:21] parse_names
[WARN] [2019-10-31 20:19:21] I see 49819 names which still need to be parsed.
[STOP] [2019-10-31 20:20:00] parse_names
[START] [2019-10-31 20:20:00] denormalize_canonical_names_to_nodes
[STOP] [2019-10-31 20:20:00] denormalize_canonical_names_to_nodes
[START] [2019-10-31 20:20:00] match_nodes
[START] [2019-10-31 20:20:01] map_all_nodes_to_pages
[STOP] [2019-10-31 22:01:21] map_all_nodes_to_pages
[INFO] [2019-10-31 22:01:21] 4460 Unmatched nodes (of 49819)! That's too many to output. First 10: Geotrupes auratus (#54024191); Lebia thermoides (#54061241); Dyschirius ovicollis (#54031117); Cicindela scutelaris (#54030062); Amara congruus (#54035648); Amara prolongatus (#54047567); Bembidion transbaicalica (#54031433); Bembidion aenipes (#54036124); Micratopus discus (#54025235); Craspedonotus tibitalis (#54027539)
[START] [2019-10-31 22:01:21] update_nodes
[STOP] [2019-10-31 22:01:38] update_nodes
[STOP] [2019-10-31 22:01:38] match_nodes
[START] [2019-10-31 22:01:38] reindex_search
[STOP] [2019-10-31 22:03:40] reindex_search
[START] [2019-10-31 22:03:40] normalize_units
[STOP] [2019-10-31 22:03:40] normalize_units
[START] [2019-10-31 22:03:40] calculate_statistics
[STOP] [2019-10-31 22:03:41] calculate_statistics
[START] [2019-10-31 22:03:41] complete_harvest_instance
[START] [2019-10-31 22:03:41] overall_tsv_creation
[INFO] [2019-10-31 22:03:41] Processing group of 49819 in 5 batches of 10000
[INFO] [2019-10-31 22:05:23] 5592 Traits (unfiltered)...
[INFO] [2019-10-31 22:05:36] 5592 Traits (filtered)...
[INFO] [2019-10-31 22:05:36] 0 Associations (filtered)...
[INFO] [2019-10-31 22:06:24] 27946 metadata added.
[INFO] [2019-10-31 22:06:24] 0 metadata added.
[INFO] [2019-10-31 22:07:56] 6292 Traits (unfiltered)...
[INFO] [2019-10-31 22:08:09] 6292 Traits (filtered)...
[INFO] [2019-10-31 22:08:09] 0 Associations (filtered)...
[INFO] [2019-10-31 22:08:59] 31437 metadata added.
[INFO] [2019-10-31 22:08:59] 0 metadata added.
[INFO] [2019-10-31 22:10:33] 6685 Traits (unfiltered)...
[INFO] [2019-10-31 22:10:46] 6685 Traits (filtered)...
[INFO] [2019-10-31 22:10:46] 0 Associations (filtered)...
[INFO] [2019-10-31 22:11:39] 33400 metadata added.
[INFO] [2019-10-31 22:11:39] 0 metadata added.
[INFO] [2019-10-31 22:13:14] 6964 Traits (unfiltered)...
[INFO] [2019-10-31 22:13:28] 6964 Traits (filtered)...
[INFO] [2019-10-31 22:13:28] 0 Associations (filtered)...
[INFO] [2019-10-31 22:14:21] 34791 metadata added.
[INFO] [2019-10-31 22:14:21] 0 metadata added.
[INFO] [2019-10-31 22:15:54] 7224 Traits (unfiltered)...
[INFO] [2019-10-31 22:16:07] 7224 Traits (filtered)...
[INFO] [2019-10-31 22:16:07] 0 Associations (filtered)...
[INFO] [2019-10-31 22:17:00] 36084 metadata added.
[INFO] [2019-10-31 22:17:00] 0 metadata added.
[INFO] [2019-10-31 22:17:00] Average Time: 128.186
[INFO] [2019-10-31 22:17:00] Total Time: 13m20s
[STOP] [2019-10-31 22:17:00] overall_tsv_creation
[INFO] [2019-10-31 22:17:00] Done. Check your files:
[INFO] [2019-10-31 22:17:01] (49819 lines) /app/public/data/japan_sp_list/publish_nodes.tsv
[INFO] [2019-10-31 22:17:01] (273066 lines) /app/public/data/japan_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-31 22:17:01] (49819 lines) /app/public/data/japan_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-31 22:17:01] (32758 lines) /app/public/data/japan_sp_list/publish_traits.tsv
[INFO] [2019-10-31 22:17:01] (163659 lines) /app/public/data/japan_sp_list/publish_metadata.tsv
[STOP] [2019-10-31 22:17:02] complete_harvest_instance
[START] [2019-10-31 22:17:02] completed
[STOP] [2019-10-31 22:17:02] completed
[STOP] [2019-10-31 22:17:02] logged process, took 7656.64
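
To see where the time goes in a log like the one above, the START/STOP pairs can be turned into per-stage durations. The following is a minimal sketch, not part of the harvester itself: the function name stage_durations and the regular expression are illustrative assumptions, and it assumes each stage name appears at most once between its START and STOP lines, which holds for this log.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-31 20:09:25] fetch_files".
# The pattern is an assumption based on the log lines shown above.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: elapsed_seconds} for each START/STOP pair."""
    open_stages = {}  # stage name -> datetime of its START line
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # INFO, WARN, CMD and header lines are ignored
        kind, stamp, name = match.groups()
        # The final STOP line carries a ", took <seconds>" suffix; drop it
        # so the name matches its START line.
        name = name.split(", took ")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_stages[name] = when
        elif name in open_stages:
            durations[name] = (when - open_stages.pop(name)).total_seconds()
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2019-10-31 20:20:00] match_nodes",
    "[START] [2019-10-31 20:20:01] map_all_nodes_to_pages",
    "[STOP] [2019-10-31 22:01:21] map_all_nodes_to_pages",
    "[STOP] [2019-10-31 22:01:38] match_nodes",
]
result = stage_durations(sample)
for stage, seconds in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{stage}: {seconds:.0f} s")
```

On these four lines the sketch reports 6080 s for map_all_nodes_to_pages, which is most of the 7656.64 s logged for the whole process.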
