Harvest for North Atlantic Species List
Created: 23 Dec 19:29

Stage: completed
Fetched: 23 Dec 19:29
Validated: 23 Dec 19:29
Deltas Created: 23 Dec 19:29
Units Normalized: 24 Dec 02:03
Ancestry Built: 23 Dec 19:57
Nodes Matched: 24 Dec 01:58
Names Parsed: 23 Dec 19:59
New Models Stored: 23 Dec 19:44
Indexed: 24 Dec 02:03
Completed: 24 Dec 02:35
Time to Harvest: 7 hours 5 minutes

Harvesting Log

(209 lines)
# Logfile created on 2019-12-23 19:29:20 -0500 by logger.rb/56815
[START] [2019-12-23 19:29:20] logged process
[START] [2019-12-23 19:29:20] create_harvest_instance
[STOP] [2019-12-23 19:29:21] create_harvest_instance
[START] [2019-12-23 19:29:21] fetch_files
[STOP] [2019-12-23 19:29:21] fetch_files
[START] [2019-12-23 19:29:21] validate_each_file
[STOP] [2019-12-23 19:29:37] validate_each_file
[START] [2019-12-23 19:29:37] convert_to_csv
[CMD] [2019-12-23 19:29:37] /usr/bin/sort /app/public/converted_csv/n_atlantic_sp_li_refs_19530.csv > /app/public/converted_csv/n_atlantic_sp_li_refs_19530.csv_sorted
[CMD] [2019-12-23 19:29:37] /usr/bin/sort /app/public/converted_csv/n_atlantic_sp_li_nodes_19531.csv > /app/public/converted_csv/n_atlantic_sp_li_nodes_19531.csv_sorted
[CMD] [2019-12-23 19:29:37] /usr/bin/sort /app/public/converted_csv/n_atlantic_sp_li_occurrences_19532.csv > /app/public/converted_csv/n_atlantic_sp_li_occurrences_19532.csv_sorted
[CMD] [2019-12-23 19:29:37] /usr/bin/sort /app/public/converted_csv/n_atlantic_sp_li_measurements_19533.csv > /app/public/converted_csv/n_atlantic_sp_li_measurements_19533.csv_sorted
[STOP] [2019-12-23 19:29:37] convert_to_csv
[START] [2019-12-23 19:29:37] calculate_delta
[CMD] [2019-12-23 19:29:37] echo "0a" > /app/public/diff/n_atlantic_sp_li_refs_19530.diff
[CMD] [2019-12-23 19:29:38] tail -n +1 /app/public/converted_csv/n_atlantic_sp_li_refs_19530.csv >> /app/public/diff/n_atlantic_sp_li_refs_19530.diff
[CMD] [2019-12-23 19:29:38] echo "." >> /app/public/diff/n_atlantic_sp_li_refs_19530.diff
[CMD] [2019-12-23 19:29:38] echo "0a" > /app/public/diff/n_atlantic_sp_li_nodes_19531.diff
[CMD] [2019-12-23 19:29:38] tail -n +1 /app/public/converted_csv/n_atlantic_sp_li_nodes_19531.csv >> /app/public/diff/n_atlantic_sp_li_nodes_19531.diff
[CMD] [2019-12-23 19:29:38] echo "." >> /app/public/diff/n_atlantic_sp_li_nodes_19531.diff
[CMD] [2019-12-23 19:29:38] echo "0a" > /app/public/diff/n_atlantic_sp_li_occurrences_19532.diff
[CMD] [2019-12-23 19:29:38] tail -n +1 /app/public/converted_csv/n_atlantic_sp_li_occurrences_19532.csv >> /app/public/diff/n_atlantic_sp_li_occurrences_19532.diff
[CMD] [2019-12-23 19:29:38] echo "." >> /app/public/diff/n_atlantic_sp_li_occurrences_19532.diff
[CMD] [2019-12-23 19:29:39] echo "0a" > /app/public/diff/n_atlantic_sp_li_measurements_19533.diff
[CMD] [2019-12-23 19:29:39] tail -n +1 /app/public/converted_csv/n_atlantic_sp_li_measurements_19533.csv >> /app/public/diff/n_atlantic_sp_li_measurements_19533.diff
[CMD] [2019-12-23 19:29:39] echo "." >> /app/public/diff/n_atlantic_sp_li_measurements_19533.diff
[STOP] [2019-12-23 19:29:39] calculate_delta
[START] [2019-12-23 19:29:39] parse_diff_and_store
[INFO] [2019-12-23 19:29:39] Loading refs diff file into memory (true lines)...
[INFO] [2019-12-23 19:29:39] Loading nodes diff file into memory (true lines)...
[WARN] [2019-12-23 19:30:03] Filtered Scientific Name `Megasyrphus  laxus` to `Megasyrphus laxus`
[INFO] [2019-12-23 19:30:26] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-12-23 19:30:39] Loading measurements diff file into memory (true lines)...
[INFO] [2019-12-23 19:39:29] Storing 2 References
[INFO] [2019-12-23 19:39:29] Processing group of 2 in 1 groups of 1000
[INFO] [2019-12-23 19:39:29] Average Time: 0.0
[INFO] [2019-12-23 19:39:29] Total Time: 1s
[INFO] [2019-12-23 19:39:29] Storing 115462 ScientificNames
[INFO] [2019-12-23 19:39:29] Processing group of 115462 in 116 groups of 1000
[INFO] [2019-12-23 19:40:37] Average Time: 0.581
[INFO] [2019-12-23 19:40:37] Total Time: 1m9s
[INFO] [2019-12-23 19:40:37] last 3 / first 3: 0.92
[INFO] [2019-12-23 19:40:37] Std.Dev: 0.7049822692805827; Max: 4.96
[INFO] [2019-12-23 19:40:37] Storing 115462 Nodes
[INFO] [2019-12-23 19:40:37] Processing group of 115462 in 116 groups of 1000
[INFO] [2019-12-23 19:41:40] Average Time: 0.537
[INFO] [2019-12-23 19:41:40] Total Time: 1m3s
[INFO] [2019-12-23 19:41:40] last 3 / first 3: 0.84
[INFO] [2019-12-23 19:41:40] Std.Dev: 0.906642156531451; Max: 5.52
[INFO] [2019-12-23 19:41:40] Storing 83854 Occurrences
[INFO] [2019-12-23 19:41:40] Processing group of 83854 in 84 groups of 1000
[INFO] [2019-12-23 19:41:58] Average Time: 0.206
[INFO] [2019-12-23 19:41:58] Total Time: 18s
[INFO] [2019-12-23 19:41:58] last 3 / first 3: 16.6
[INFO] [2019-12-23 19:41:58] Std.Dev: 0.5890670590009256; Max: 5.53
[INFO] [2019-12-23 19:41:58] Storing 167708 TraitsReferences
[INFO] [2019-12-23 19:41:58] Processing group of 167708 in 168 groups of 1000
[INFO] [2019-12-23 19:42:17] Average Time: 0.11
[INFO] [2019-12-23 19:42:17] Total Time: 20s
[INFO] [2019-12-23 19:42:17] last 3 / first 3: 0.59
[INFO] [2019-12-23 19:42:17] Std.Dev: 0.4074309757492673; Max: 5.35
[INFO] [2019-12-23 19:42:17] Storing 167708 Traits
[INFO] [2019-12-23 19:42:17] Processing group of 167708 in 168 groups of 1000
[INFO] [2019-12-23 19:43:53] Average Time: 0.567
[INFO] [2019-12-23 19:43:53] Total Time: 1m37s
[INFO] [2019-12-23 19:43:53] last 3 / first 3: 0.69
[INFO] [2019-12-23 19:43:53] Std.Dev: 1.0677078252031311; Max: 6.37
[INFO] [2019-12-23 19:43:53] Storing 167555 MetaTraits
[INFO] [2019-12-23 19:43:53] Processing group of 167555 in 168 groups of 1000
[INFO] [2019-12-23 19:44:43] Average Time: 0.291
[INFO] [2019-12-23 19:44:43] Total Time: 50s
[INFO] [2019-12-23 19:44:43] last 3 / first 3: 0.63
[INFO] [2019-12-23 19:44:43] Std.Dev: 0.9612491872558333; Max: 6.63
[STOP] [2019-12-23 19:44:43] parse_diff_and_store
[START] [2019-12-23 19:44:43] resolve_keys
[INFO] [2019-12-23 19:47:35] Occurrences to nodes (through scientific_names)...
[INFO] [2019-12-23 19:47:44] traits to occurrences...
[INFO] [2019-12-23 19:47:55] traits to nodes (through occurrences)...
[INFO] [2019-12-23 19:47:57] Traits to sex term...
[INFO] [2019-12-23 19:48:05] Traits to lifestage term...
[INFO] [2019-12-23 19:48:12] MetaTraits to traits...
[INFO] [2019-12-23 19:48:22] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-12-23 19:48:45] Assocs to occurrences...
[INFO] [2019-12-23 19:48:45] Assocs to nodes...
[INFO] [2019-12-23 19:48:45] Assoc to sex term...
[INFO] [2019-12-23 19:48:45] Assoc to lifestage term...
[STOP] [2019-12-23 19:48:45] resolve_keys
[START] [2019-12-23 19:48:45] hold_for_later_1
[STOP] [2019-12-23 19:48:45] hold_for_later_1
[START] [2019-12-23 19:48:45] hold_for_later_2
[STOP] [2019-12-23 19:48:45] hold_for_later_2
[START] [2019-12-23 19:48:45] resolve_missing_parents
[STOP] [2019-12-23 19:50:47] resolve_missing_parents
[START] [2019-12-23 19:50:47] rebuild_nodes
[START] [2019-12-23 19:50:47] Flattener#flatten
[START] [2019-12-23 19:50:47] Flattener#study_resource
[START] [2019-12-23 19:50:47] Flattener#build_ancestry
[STOP] [2019-12-23 19:54:54] Flattener#build_ancestry
[INFO] [2019-12-23 19:54:54] 115462 ancestry keys
[START] [2019-12-23 19:54:54] build_node_ancestors
[INFO] [2019-12-23 19:54:54] old ancestors deleted.
[STOP] [2019-12-23 19:56:48] build_node_ancestors
[START] [2019-12-23 19:56:51] Flattener#propagate_ancestor_ids
[STOP] [2019-12-23 19:57:22] Flattener#propagate_ancestor_ids
[STOP] [2019-12-23 19:57:22] Flattener#flatten
[STOP] [2019-12-23 19:57:22] rebuild_nodes
[START] [2019-12-23 19:57:22] resolve_missing_media_owners
[STOP] [2019-12-23 19:57:22] resolve_missing_media_owners
[START] [2019-12-23 19:57:22] sanitize_media_verbatims
[STOP] [2019-12-23 19:57:22] sanitize_media_verbatims
[START] [2019-12-23 19:57:22] queue_downloads
[STOP] [2019-12-23 19:57:22] queue_downloads
[START] [2019-12-23 19:57:22] parse_names
[WARN] [2019-12-23 19:57:22] I see 115462 names which still need to be parsed.
[STOP] [2019-12-23 19:59:03] parse_names
[START] [2019-12-23 19:59:03] denormalize_canonical_names_to_nodes
[STOP] [2019-12-23 19:59:04] denormalize_canonical_names_to_nodes
[START] [2019-12-23 19:59:04] match_nodes
[START] [2019-12-23 19:59:05] map_all_nodes_to_pages
[STOP] [2019-12-24 01:58:47] map_all_nodes_to_pages
[INFO] [2019-12-24 01:58:47] 10483 Unmatched nodes (of 115462)! That's too many to output. First 10: Corvus monedula (#62141457); Cyanopica cyana (#62146360); Spizella arborea (#62138652); Spizella domesticus (#62229575); Tiaris olivacea (#62169689); Aimophila aestivalis (#62159728); Aimophila cassinii (#62238528); Myospiza (#62233923); Myospiza aurifrons (#62233922); Molothrus oryzivora (#62194261)
[START] [2019-12-24 01:58:47] update_nodes
[STOP] [2019-12-24 01:58:52] update_nodes
[STOP] [2019-12-24 01:58:52] match_nodes
[START] [2019-12-24 01:58:52] reindex_search
[STOP] [2019-12-24 02:03:09] reindex_search
[START] [2019-12-24 02:03:09] normalize_units
[STOP] [2019-12-24 02:03:09] normalize_units
[START] [2019-12-24 02:03:09] calculate_statistics
[STOP] [2019-12-24 02:03:10] calculate_statistics
[START] [2019-12-24 02:03:10] complete_harvest_instance
[START] [2019-12-24 02:03:10] overall_tsv_creation
[INFO] [2019-12-24 02:03:10] Processing group of 115462 in 12 batches of 10000
[INFO] [2019-12-24 02:04:39] 5268 Traits (unfiltered)...
[INFO] [2019-12-24 02:04:52] 5268 Traits (filtered)...
[INFO] [2019-12-24 02:04:52] 0 Associations (filtered)...
[INFO] [2019-12-24 02:05:40] 26335 metadata added.
[INFO] [2019-12-24 02:05:40] 0 metadata added.
[INFO] [2019-12-24 02:07:12] 6326 Traits (unfiltered)...
[INFO] [2019-12-24 02:07:26] 6326 Traits (filtered)...
[INFO] [2019-12-24 02:07:26] 0 Associations (filtered)...
[INFO] [2019-12-24 02:08:16] 31623 metadata added.
[INFO] [2019-12-24 02:08:16] 0 metadata added.
[INFO] [2019-12-24 02:09:53] 6774 Traits (unfiltered)...
[INFO] [2019-12-24 02:10:06] 6774 Traits (filtered)...
[INFO] [2019-12-24 02:10:06] 0 Associations (filtered)...
[INFO] [2019-12-24 02:11:01] 33862 metadata added.
[INFO] [2019-12-24 02:11:01] 0 metadata added.
[INFO] [2019-12-24 02:12:36] 7142 Traits (unfiltered)...
[INFO] [2019-12-24 02:12:49] 7142 Traits (filtered)...
[INFO] [2019-12-24 02:12:49] 0 Associations (filtered)...
[INFO] [2019-12-24 02:13:44] 35704 metadata added.
[INFO] [2019-12-24 02:13:44] 0 metadata added.
[INFO] [2019-12-24 02:15:20] 7283 Traits (unfiltered)...
[INFO] [2019-12-24 02:15:33] 7283 Traits (filtered)...
[INFO] [2019-12-24 02:15:33] 0 Associations (filtered)...
[INFO] [2019-12-24 02:16:28] 36404 metadata added.
[INFO] [2019-12-24 02:16:28] 0 metadata added.
[INFO] [2019-12-24 02:18:03] 7464 Traits (unfiltered)...
[INFO] [2019-12-24 02:18:17] 7464 Traits (filtered)...
[INFO] [2019-12-24 02:18:17] 0 Associations (filtered)...
[INFO] [2019-12-24 02:19:11] 37308 metadata added.
[INFO] [2019-12-24 02:19:11] 0 metadata added.
[INFO] [2019-12-24 02:20:45] 7554 Traits (unfiltered)...
[INFO] [2019-12-24 02:20:58] 7554 Traits (filtered)...
[INFO] [2019-12-24 02:20:58] 0 Associations (filtered)...
[INFO] [2019-12-24 02:21:53] 37754 metadata added.
[INFO] [2019-12-24 02:21:53] 0 metadata added.
[INFO] [2019-12-24 02:23:27] 7723 Traits (unfiltered)...
[INFO] [2019-12-24 02:23:40] 7723 Traits (filtered)...
[INFO] [2019-12-24 02:23:40] 0 Associations (filtered)...
[INFO] [2019-12-24 02:24:35] 38596 metadata added.
[INFO] [2019-12-24 02:24:35] 0 metadata added.
[INFO] [2019-12-24 02:26:10] 7755 Traits (unfiltered)...
[INFO] [2019-12-24 02:26:23] 7755 Traits (filtered)...
[INFO] [2019-12-24 02:26:23] 0 Associations (filtered)...
[INFO] [2019-12-24 02:27:19] 38764 metadata added.
[INFO] [2019-12-24 02:27:19] 0 metadata added.
[INFO] [2019-12-24 02:28:53] 7906 Traits (unfiltered)...
[INFO] [2019-12-24 02:29:06] 7906 Traits (filtered)...
[INFO] [2019-12-24 02:29:06] 0 Associations (filtered)...
[INFO] [2019-12-24 02:30:02] 39512 metadata added.
[INFO] [2019-12-24 02:30:02] 0 metadata added.
[INFO] [2019-12-24 02:31:38] 8105 Traits (unfiltered)...
[INFO] [2019-12-24 02:31:51] 8105 Traits (filtered)...
[INFO] [2019-12-24 02:31:51] 0 Associations (filtered)...
[INFO] [2019-12-24 02:32:47] 40496 metadata added.
[INFO] [2019-12-24 02:32:47] 0 metadata added.
[INFO] [2019-12-24 02:34:00] 4554 Traits (unfiltered)...
[INFO] [2019-12-24 02:34:13] 4554 Traits (filtered)...
[INFO] [2019-12-24 02:34:13] 0 Associations (filtered)...
[INFO] [2019-12-24 02:34:59] 22759 metadata added.
[INFO] [2019-12-24 02:34:59] 0 metadata added.
[INFO] [2019-12-24 02:34:59] Average Time: 130.355
[INFO] [2019-12-24 02:34:59] Total Time: 31m49s
[INFO] [2019-12-24 02:34:59] last 3 / first 3: 0.99
[INFO] [2019-12-24 02:34:59] Std.Dev: 8.461855588462852; Max: 136.8
[STOP] [2019-12-24 02:34:59] overall_tsv_creation
[INFO] [2019-12-24 02:34:59] Done. Check your files:
[INFO] [2019-12-24 02:34:59] (115462 lines) /app/public/data/n_atlantic_sp_li/publish_nodes.tsv
[INFO] [2019-12-24 02:34:59] (640456 lines) /app/public/data/n_atlantic_sp_li/publish_node_ancestors.tsv
[INFO] [2019-12-24 02:34:59] (115462 lines) /app/public/data/n_atlantic_sp_li/publish_scientific_names.tsv
[INFO] [2019-12-24 02:35:00] (83855 lines) /app/public/data/n_atlantic_sp_li/publish_traits.tsv
[INFO] [2019-12-24 02:35:00] (419118 lines) /app/public/data/n_atlantic_sp_li/publish_metadata.tsv
[STOP] [2019-12-24 02:35:00] complete_harvest_instance
[START] [2019-12-24 02:35:00] completed
[STOP] [2019-12-24 02:35:00] completed
[STOP] [2019-12-24 02:35:00] logged process, took 25539.66
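
The log above shows where the time went, but only implicitly, through its [START]/[STOP] pairs. As a rough sketch (not part of the harvester itself), the following Python pairs those lines to get a per-stage duration. The function name `stage_durations` and the regular expression are assumptions made for illustration; the line layout is taken from the log lines shown above.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-12-23 19:29:21] fetch_files".
# The pattern is inferred from the log above, not from the logger's source.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} by pairing each [START] with its [STOP]."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # [INFO], [WARN], [CMD] and header lines are skipped
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        else:
            # A STOP line may carry a suffix such as ", took 25539.66".
            name = name.split(",")[0]
            if name in started:
                durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2019-12-23 19:59:05] map_all_nodes_to_pages",
    "[STOP] [2019-12-24 01:58:47] map_all_nodes_to_pages",
    "[START] [2019-12-24 02:03:10] overall_tsv_creation",
    "[STOP] [2019-12-24 02:34:59] overall_tsv_creation",
]

# Print the slowest stage first.
for name, secs in sorted(stage_durations(sample).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {secs / 60:.1f} min")
```

Run against the full log, a script like this would likely show map_all_nodes_to_pages (about six hours) accounting for most of the run, with overall_tsv_creation (31m49s) a distant second.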

Latest Process