Harvest for China Species List (Created 12 Oct 03:17)

Stage: completed
Fetched: 12 Oct 03:17
Validated: 12 Oct 03:17
Deltas Created: 12 Oct 03:17
Units Normalized: 12 Oct 05:12
Ancestry Built: 12 Oct 03:34
Nodes Matched: 12 Oct 05:09
Names Parsed: 12 Oct 03:35
New Models Stored: 12 Oct 03:26
Indexed: 12 Oct 05:12
Completed: 12 Oct 05:35
Time to Harvest: about 2 hours 19 minutes (8330.23 seconds)

Harvesting Log

(198 lines)
# Logfile created on 2019-10-12 03:17:03 -0400 by logger.rb/56815
[START] [2019-10-12 03:17:03] logged process
[START] [2019-10-12 03:17:03] create_harvest_instance
[STOP] [2019-10-12 03:17:04] create_harvest_instance
[START] [2019-10-12 03:17:04] fetch_files
[STOP] [2019-10-12 03:17:04] fetch_files
[START] [2019-10-12 03:17:04] validate_each_file
[STOP] [2019-10-12 03:17:14] validate_each_file
[START] [2019-10-12 03:17:14] convert_to_csv
[CMD] [2019-10-12 03:17:14] /usr/bin/sort /app/public/converted_csv/china_sp_list_refs_15427.csv > /app/public/converted_csv/china_sp_list_refs_15427.csv_sorted
[CMD] [2019-10-12 03:17:14] /usr/bin/sort /app/public/converted_csv/china_sp_list_nodes_15428.csv > /app/public/converted_csv/china_sp_list_nodes_15428.csv_sorted
[CMD] [2019-10-12 03:17:14] /usr/bin/sort /app/public/converted_csv/china_sp_list_occurrences_15429.csv > /app/public/converted_csv/china_sp_list_occurrences_15429.csv_sorted
[CMD] [2019-10-12 03:17:14] /usr/bin/sort /app/public/converted_csv/china_sp_list_measurements_15430.csv > /app/public/converted_csv/china_sp_list_measurements_15430.csv_sorted
[STOP] [2019-10-12 03:17:14] convert_to_csv
[START] [2019-10-12 03:17:14] calculate_delta
[CMD] [2019-10-12 03:17:14] echo "0a" > /app/public/diff/china_sp_list_refs_15427.diff
[CMD] [2019-10-12 03:17:14] tail -n +1 /app/public/converted_csv/china_sp_list_refs_15427.csv >> /app/public/diff/china_sp_list_refs_15427.diff
[CMD] [2019-10-12 03:17:14] echo "." >> /app/public/diff/china_sp_list_refs_15427.diff
[CMD] [2019-10-12 03:17:15] echo "0a" > /app/public/diff/china_sp_list_nodes_15428.diff
[CMD] [2019-10-12 03:17:15] tail -n +1 /app/public/converted_csv/china_sp_list_nodes_15428.csv >> /app/public/diff/china_sp_list_nodes_15428.diff
[CMD] [2019-10-12 03:17:15] echo "." >> /app/public/diff/china_sp_list_nodes_15428.diff
[CMD] [2019-10-12 03:17:15] echo "0a" > /app/public/diff/china_sp_list_occurrences_15429.diff
[CMD] [2019-10-12 03:17:15] tail -n +1 /app/public/converted_csv/china_sp_list_occurrences_15429.csv >> /app/public/diff/china_sp_list_occurrences_15429.diff
[CMD] [2019-10-12 03:17:15] echo "." >> /app/public/diff/china_sp_list_occurrences_15429.diff
[CMD] [2019-10-12 03:17:15] echo "0a" > /app/public/diff/china_sp_list_measurements_15430.diff
[CMD] [2019-10-12 03:17:15] tail -n +1 /app/public/converted_csv/china_sp_list_measurements_15430.csv >> /app/public/diff/china_sp_list_measurements_15430.diff
[CMD] [2019-10-12 03:17:15] echo "." >> /app/public/diff/china_sp_list_measurements_15430.diff
[STOP] [2019-10-12 03:17:15] calculate_delta
[START] [2019-10-12 03:17:15] parse_diff_and_store
[INFO] [2019-10-12 03:17:16] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-12 03:17:16] Loading nodes diff file into memory (true lines)...
[WARN] [2019-10-12 03:17:20] Filtered Scientific Name `Huechys  sanguinea` to `Huechys sanguinea`
[WARN] [2019-10-12 03:17:23] Filtered Scientific Name `Macromerella  honesta` to `Macromerella honesta`
[WARN] [2019-10-12 03:17:28] Filtered Scientific Name `Pamerana  scotti` to `Pamerana scotti`
[WARN] [2019-10-12 03:17:29] Filtered Scientific Name `Tetroda  histeroides` to `Tetroda histeroides`
[WARN] [2019-10-12 03:17:36] Filtered Scientific Name `Mogannia  nasalis` to `Mogannia nasalis`
[INFO] [2019-10-12 03:17:47] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-12 03:17:56] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-12 03:24:06] Storing 2 References
[INFO] [2019-10-12 03:24:06] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-12 03:24:06] Average Time: 0.0
[INFO] [2019-10-12 03:24:06] Total Time: 1s
[INFO] [2019-10-12 03:24:06] Storing 81510 ScientificNames
[INFO] [2019-10-12 03:24:06] Processing group of 81510 in 82 groups of 1000
[INFO] [2019-10-12 03:24:46] Average Time: 0.479
[INFO] [2019-10-12 03:24:46] Total Time: 40s
[INFO] [2019-10-12 03:24:46] last 3 / first 3: 1.16
[INFO] [2019-10-12 03:24:46] Std.Dev: 0.41952353926806063; Max: 3.21
[INFO] [2019-10-12 03:24:46] Storing 81510 Nodes
[INFO] [2019-10-12 03:24:46] Processing group of 81510 in 82 groups of 1000
[INFO] [2019-10-12 03:25:22] Average Time: 0.44
[INFO] [2019-10-12 03:25:22] Total Time: 37s
[INFO] [2019-10-12 03:25:22] last 3 / first 3: 0.39
[INFO] [2019-10-12 03:25:22] Std.Dev: 0.51478150704935; Max: 3.56
[INFO] [2019-10-12 03:25:22] Storing 61854 Occurrences
[INFO] [2019-10-12 03:25:22] Processing group of 61854 in 62 groups of 1000
[INFO] [2019-10-12 03:25:30] Average Time: 0.115
[INFO] [2019-10-12 03:25:30] Total Time: 8s
[INFO] [2019-10-12 03:25:30] last 3 / first 3: 1.2
[INFO] [2019-10-12 03:25:30] Std.Dev: 0.03162277660168379; Max: 0.23
[INFO] [2019-10-12 03:25:30] Storing 123708 TraitsReferences
[INFO] [2019-10-12 03:25:30] Processing group of 123708 in 124 groups of 1000
[INFO] [2019-10-12 03:25:46] Average Time: 0.128
[INFO] [2019-10-12 03:25:46] Total Time: 17s
[INFO] [2019-10-12 03:25:46] last 3 / first 3: 0.7
[INFO] [2019-10-12 03:25:46] Std.Dev: 0.4207136793592526; Max: 3.54
[INFO] [2019-10-12 03:25:46] Storing 123708 Traits
[INFO] [2019-10-12 03:25:46] Processing group of 123708 in 124 groups of 1000
[INFO] [2019-10-12 03:26:33] Average Time: 0.372
[INFO] [2019-10-12 03:26:33] Total Time: 47s
[INFO] [2019-10-12 03:26:33] last 3 / first 3: 1.49
[INFO] [2019-10-12 03:26:33] Std.Dev: 0.4404543109109048; Max: 3.98
[INFO] [2019-10-12 03:26:33] Storing 123561 MetaTraits
[INFO] [2019-10-12 03:26:33] Processing group of 123561 in 124 groups of 1000
[INFO] [2019-10-12 03:26:54] Average Time: 0.169
[INFO] [2019-10-12 03:26:54] Total Time: 22s
[INFO] [2019-10-12 03:26:54] last 3 / first 3: 0.1
[INFO] [2019-10-12 03:26:54] Std.Dev: 0.42544094772365293; Max: 4.07
[STOP] [2019-10-12 03:26:54] parse_diff_and_store
[START] [2019-10-12 03:26:54] resolve_keys
[INFO] [2019-10-12 03:29:48] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-12 03:29:58] traits to occurrences...
[INFO] [2019-10-12 03:30:08] traits to nodes (through occurrences)...
[INFO] [2019-10-12 03:30:09] Traits to sex term...
[INFO] [2019-10-12 03:30:17] Traits to lifestage term...
[INFO] [2019-10-12 03:30:26] MetaTraits to traits...
[INFO] [2019-10-12 03:30:34] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-12 03:30:50] Assocs to occurrences...
[INFO] [2019-10-12 03:30:50] Assocs to nodes...
[INFO] [2019-10-12 03:30:50] Assoc to sex term...
[INFO] [2019-10-12 03:30:50] Assoc to lifestage term...
[STOP] [2019-10-12 03:30:50] resolve_keys
[START] [2019-10-12 03:30:50] hold_for_later_1
[STOP] [2019-10-12 03:30:50] hold_for_later_1
[START] [2019-10-12 03:30:50] hold_for_later_2
[STOP] [2019-10-12 03:30:50] hold_for_later_2
[START] [2019-10-12 03:30:50] resolve_missing_parents
[STOP] [2019-10-12 03:32:46] resolve_missing_parents
[START] [2019-10-12 03:32:46] rebuild_nodes
[START] [2019-10-12 03:32:46] Flattener#flatten
[START] [2019-10-12 03:32:46] Flattener#study_resource
[START] [2019-10-12 03:32:46] Flattener#build_ancestry
[STOP] [2019-10-12 03:33:05] Flattener#build_ancestry
[INFO] [2019-10-12 03:33:05] 81510 ancestry keys
[START] [2019-10-12 03:33:05] build_node_ancestors
[INFO] [2019-10-12 03:33:05] old ancestors deleted.
[STOP] [2019-10-12 03:34:09] build_node_ancestors
[START] [2019-10-12 03:34:13] Flattener#propagate_ancestor_ids
[STOP] [2019-10-12 03:34:28] Flattener#propagate_ancestor_ids
[STOP] [2019-10-12 03:34:28] Flattener#flatten
[STOP] [2019-10-12 03:34:28] rebuild_nodes
[START] [2019-10-12 03:34:28] resolve_missing_media_owners
[STOP] [2019-10-12 03:34:28] resolve_missing_media_owners
[START] [2019-10-12 03:34:28] sanitize_media_verbatims
[STOP] [2019-10-12 03:34:28] sanitize_media_verbatims
[START] [2019-10-12 03:34:28] queue_downloads
[STOP] [2019-10-12 03:34:28] queue_downloads
[START] [2019-10-12 03:34:28] parse_names
[WARN] [2019-10-12 03:34:28] I see 81510 names which still need to be parsed.
[STOP] [2019-10-12 03:35:35] parse_names
[START] [2019-10-12 03:35:35] denormalize_canonical_names_to_nodes
[STOP] [2019-10-12 03:35:36] denormalize_canonical_names_to_nodes
[START] [2019-10-12 03:35:36] match_nodes
[START] [2019-10-12 03:35:36] map_all_nodes_to_pages
[STOP] [2019-10-12 05:08:37] map_all_nodes_to_pages
[INFO] [2019-10-12 05:08:37] 6666 Unmatched nodes (of 81510)! That's too many to output. First 10: Peronospora farinosa (#49197767); Globigerinoides triloba (#49134057); Globigerinoides conglobata (#49140006); Globigerinoides trilobus (#49140434); Globigerina calida (#49142570); Globigerina rubescens (#49142954); Globigerina digitata (#49148029); Globoturborotalita tenellus (#49143651); Operculinella venosa (#49203797); Archaesphaeridae (#49197322)
[START] [2019-10-12 05:08:37] update_nodes
[STOP] [2019-10-12 05:09:09] update_nodes
[STOP] [2019-10-12 05:09:09] match_nodes
[START] [2019-10-12 05:09:09] reindex_search
[STOP] [2019-10-12 05:12:14] reindex_search
[START] [2019-10-12 05:12:15] normalize_units
[STOP] [2019-10-12 05:12:15] normalize_units
[START] [2019-10-12 05:12:15] calculate_statistics
[STOP] [2019-10-12 05:12:15] calculate_statistics
[START] [2019-10-12 05:12:15] complete_harvest_instance
[START] [2019-10-12 05:12:15] overall_tsv_creation
[INFO] [2019-10-12 05:12:15] Processing group of 81510 in 9 batches of 10000
[INFO] [2019-10-12 05:13:44] 6349 Traits (unfiltered)...
[INFO] [2019-10-12 05:13:58] 6349 Traits (filtered)...
[INFO] [2019-10-12 05:13:58] 0 Associations (filtered)...
[INFO] [2019-10-12 05:14:47] 31735 metadata added.
[INFO] [2019-10-12 05:14:47] 0 metadata added.
[INFO] [2019-10-12 05:16:21] 7526 Traits (unfiltered)...
[INFO] [2019-10-12 05:16:34] 7526 Traits (filtered)...
[INFO] [2019-10-12 05:16:35] 0 Associations (filtered)...
[INFO] [2019-10-12 05:17:27] 37618 metadata added.
[INFO] [2019-10-12 05:17:27] 0 metadata added.
[INFO] [2019-10-12 05:19:04] 7842 Traits (unfiltered)...
[INFO] [2019-10-12 05:19:18] 7842 Traits (filtered)...
[INFO] [2019-10-12 05:19:20] 0 Associations (filtered)...
[INFO] [2019-10-12 05:20:16] 39197 metadata added.
[INFO] [2019-10-12 05:20:16] 0 metadata added.
[INFO] [2019-10-12 05:21:51] 7838 Traits (unfiltered)...
[INFO] [2019-10-12 05:22:05] 7838 Traits (filtered)...
[INFO] [2019-10-12 05:22:05] 0 Associations (filtered)...
[INFO] [2019-10-12 05:23:01] 39169 metadata added.
[INFO] [2019-10-12 05:23:01] 0 metadata added.
[INFO] [2019-10-12 05:24:38] 7732 Traits (unfiltered)...
[INFO] [2019-10-12 05:24:52] 7732 Traits (filtered)...
[INFO] [2019-10-12 05:24:52] 0 Associations (filtered)...
[INFO] [2019-10-12 05:25:48] 38638 metadata added.
[INFO] [2019-10-12 05:25:48] 0 metadata added.
[INFO] [2019-10-12 05:27:25] 7945 Traits (unfiltered)...
[INFO] [2019-10-12 05:27:39] 7945 Traits (filtered)...
[INFO] [2019-10-12 05:27:39] 0 Associations (filtered)...
[INFO] [2019-10-12 05:28:34] 39712 metadata added.
[INFO] [2019-10-12 05:28:34] 0 metadata added.
[INFO] [2019-10-12 05:30:12] 7603 Traits (unfiltered)...
[INFO] [2019-10-12 05:30:26] 7603 Traits (filtered)...
[INFO] [2019-10-12 05:30:26] 0 Associations (filtered)...
[INFO] [2019-10-12 05:31:21] 37988 metadata added.
[INFO] [2019-10-12 05:31:21] 0 metadata added.
[INFO] [2019-10-12 05:32:57] 7838 Traits (unfiltered)...
[INFO] [2019-10-12 05:33:11] 7838 Traits (filtered)...
[INFO] [2019-10-12 05:33:11] 0 Associations (filtered)...
[INFO] [2019-10-12 05:34:07] 39165 metadata added.
[INFO] [2019-10-12 05:34:07] 0 metadata added.
[INFO] [2019-10-12 05:35:00] 1181 Traits (unfiltered)...
[INFO] [2019-10-12 05:35:14] 1181 Traits (filtered)...
[INFO] [2019-10-12 05:35:14] 0 Associations (filtered)...
[INFO] [2019-10-12 05:35:52] 5901 metadata added.
[INFO] [2019-10-12 05:35:52] 0 metadata added.
[INFO] [2019-10-12 05:35:52] Average Time: 129.778
[INFO] [2019-10-12 05:35:52] Total Time: 23m38s
[INFO] [2019-10-12 05:35:52] last 3 / first 3: 0.91
[INFO] [2019-10-12 05:35:52] Std.Dev: 17.403160632482827; Max: 138.14
[STOP] [2019-10-12 05:35:52] overall_tsv_creation
[INFO] [2019-10-12 05:35:52] Done. Check your files:
[INFO] [2019-10-12 05:35:52] (81510 lines) /app/public/data/china_sp_list/publish_nodes.tsv
[INFO] [2019-10-12 05:35:53] (456872 lines) /app/public/data/china_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-12 05:35:53] (81510 lines) /app/public/data/china_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-12 05:35:53] (61855 lines) /app/public/data/china_sp_list/publish_traits.tsv
[INFO] [2019-10-12 05:35:53] (309124 lines) /app/public/data/china_sp_list/publish_metadata.tsv
[STOP] [2019-10-12 05:35:53] complete_harvest_instance
[START] [2019-10-12 05:35:53] completed
[STOP] [2019-10-12 05:35:53] completed
[STOP] [2019-10-12 05:35:53] logged process, took 8330.23
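
The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which makes it easier to see where the time went (here, mostly map_all_nodes_to_pages and overall_tsv_creation). The following is a minimal sketch in Python, not part of the harvester itself. The function name stage_durations and the regular expression are assumptions for illustration; the sketch relies only on the line layout visible in this log and assumes each stage name is started at most once before it is stopped.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-12 03:17:03] fetch_files".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage name: seconds} for every START/STOP pair found.

    Hypothetical helper: pairs are matched by stage name, so nested
    stages (e.g. Flattener#flatten inside rebuild_nodes) are handled.
    """
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # INFO, WARN, CMD and header lines are ignored
        kind, stamp, name = m.groups()
        # The final STOP line carries a ", took N" suffix; drop it.
        name = re.sub(r", took [\d.]+$", "", name)
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


sample = [
    "[START] [2019-10-12 03:35:36] map_all_nodes_to_pages",
    "[STOP] [2019-10-12 05:08:37] map_all_nodes_to_pages",
    "[START] [2019-10-12 05:09:09] reindex_search",
    "[STOP] [2019-10-12 05:12:14] reindex_search",
]
result = stage_durations(sample)
for stage, seconds in result.items():
    print(f"{stage}: {seconds:.0f}s")
```

Run against the four sample lines copied from the log, this reports 5581 seconds for map_all_nodes_to_pages and 185 seconds for reindex_search. Timestamps in the log have one-second resolution, so short stages show as 0 or 1 second.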