Harvest for Brazil Species List
Created: 11 Oct 20:41

Stage: completed
Fetched: 11 Oct 20:41
Validated: 11 Oct 20:41
Deltas Created: 11 Oct 20:41
Units Normalized: 11 Oct 22:36
Ancestry Built: 11 Oct 21:09
Nodes Matched: 11 Oct 22:33
Names Parsed: 11 Oct 21:11
New Models Stored: 11 Oct 20:57
Indexed: 11 Oct 22:36
Completed: 11 Oct 23:07
Time to Harvest: 2 hours 26 minutes

Harvesting Log (most recent first)

# Logfile created on 2019-10-11 20:41:30 -0400 by logger.rb/56815
[START] [2019-10-11 20:41:30] logged process
[START] [2019-10-11 20:41:30] create_harvest_instance
[STOP] [2019-10-11 20:41:30] create_harvest_instance
[START] [2019-10-11 20:41:30] fetch_files
[STOP] [2019-10-11 20:41:30] fetch_files
[START] [2019-10-11 20:41:30] validate_each_file
[STOP] [2019-10-11 20:41:45] validate_each_file
[START] [2019-10-11 20:41:45] convert_to_csv
[CMD] [2019-10-11 20:41:45] /usr/bin/sort /app/public/converted_csv/brazil_sp_list_refs_15299.csv > /app/public/converted_csv/brazil_sp_list_refs_15299.csv_sorted
[CMD] [2019-10-11 20:41:45] /usr/bin/sort /app/public/converted_csv/brazil_sp_list_nodes_15300.csv > /app/public/converted_csv/brazil_sp_list_nodes_15300.csv_sorted
[CMD] [2019-10-11 20:41:45] /usr/bin/sort /app/public/converted_csv/brazil_sp_list_occurrences_15301.csv > /app/public/converted_csv/brazil_sp_list_occurrences_15301.csv_sorted
[CMD] [2019-10-11 20:41:45] /usr/bin/sort /app/public/converted_csv/brazil_sp_list_measurements_15302.csv > /app/public/converted_csv/brazil_sp_list_measurements_15302.csv_sorted
[STOP] [2019-10-11 20:41:46] convert_to_csv
[START] [2019-10-11 20:41:46] calculate_delta
[CMD] [2019-10-11 20:41:46] echo "0a" > /app/public/diff/brazil_sp_list_refs_15299.diff
[CMD] [2019-10-11 20:41:46] tail -n +1 /app/public/converted_csv/brazil_sp_list_refs_15299.csv >> /app/public/diff/brazil_sp_list_refs_15299.diff
[CMD] [2019-10-11 20:41:46] echo "." >> /app/public/diff/brazil_sp_list_refs_15299.diff
[CMD] [2019-10-11 20:41:46] echo "0a" > /app/public/diff/brazil_sp_list_nodes_15300.diff
[CMD] [2019-10-11 20:41:46] tail -n +1 /app/public/converted_csv/brazil_sp_list_nodes_15300.csv >> /app/public/diff/brazil_sp_list_nodes_15300.diff
[CMD] [2019-10-11 20:41:46] echo "." >> /app/public/diff/brazil_sp_list_nodes_15300.diff
[CMD] [2019-10-11 20:41:46] echo "0a" > /app/public/diff/brazil_sp_list_occurrences_15301.diff
[CMD] [2019-10-11 20:41:46] tail -n +1 /app/public/converted_csv/brazil_sp_list_occurrences_15301.csv >> /app/public/diff/brazil_sp_list_occurrences_15301.diff
[CMD] [2019-10-11 20:41:46] echo "." >> /app/public/diff/brazil_sp_list_occurrences_15301.diff
[CMD] [2019-10-11 20:41:46] echo "0a" > /app/public/diff/brazil_sp_list_measurements_15302.diff
[CMD] [2019-10-11 20:41:47] tail -n +1 /app/public/converted_csv/brazil_sp_list_measurements_15302.csv >> /app/public/diff/brazil_sp_list_measurements_15302.diff
[CMD] [2019-10-11 20:41:47] echo "." >> /app/public/diff/brazil_sp_list_measurements_15302.diff
[STOP] [2019-10-11 20:41:47] calculate_delta
[START] [2019-10-11 20:41:47] parse_diff_and_store
[INFO] [2019-10-11 20:41:47] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-11 20:41:47] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-11 20:42:29] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-11 20:42:42] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-11 20:52:09] Storing 2 References
[INFO] [2019-10-11 20:52:09] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-11 20:52:09] Average Time: 0.0
[INFO] [2019-10-11 20:52:09] Total Time: 1s
[INFO] [2019-10-11 20:52:09] Storing 107705 ScientificNames
[INFO] [2019-10-11 20:52:09] Processing group of 107705 in 108 groups of 1000
[INFO] [2019-10-11 20:53:09] Average Time: 0.543
[INFO] [2019-10-11 20:53:09] Total Time: 60s
[INFO] [2019-10-11 20:53:09] last 3 / first 3: 0.87
[INFO] [2019-10-11 20:53:09] Std.Dev: 0.6511528238439882; Max: 4.44
[INFO] [2019-10-11 20:53:09] Storing 107705 Nodes
[INFO] [2019-10-11 20:53:09] Processing group of 107705 in 108 groups of 1000
[INFO] [2019-10-11 20:54:04] Average Time: 0.507
[INFO] [2019-10-11 20:54:04] Total Time: 56s
[INFO] [2019-10-11 20:54:04] last 3 / first 3: 0.95
[INFO] [2019-10-11 20:54:04] Std.Dev: 0.8420213774008354; Max: 4.94
[INFO] [2019-10-11 20:54:04] Storing 90247 Occurrences
[INFO] [2019-10-11 20:54:04] Processing group of 90247 in 91 groups of 1000
[INFO] [2019-10-11 20:54:25] Average Time: 0.232
[INFO] [2019-10-11 20:54:25] Total Time: 22s
[INFO] [2019-10-11 20:54:25] last 3 / first 3: 0.47
[INFO] [2019-10-11 20:54:25] Std.Dev: 0.6992853494818835; Max: 4.87
[INFO] [2019-10-11 20:54:25] Storing 180494 TraitsReferences
[INFO] [2019-10-11 20:54:25] Processing group of 180494 in 181 groups of 1000
[INFO] [2019-10-11 20:54:39] Average Time: 0.069
[INFO] [2019-10-11 20:54:39] Total Time: 14s
[INFO] [2019-10-11 20:54:39] last 3 / first 3: 0.56
[INFO] [2019-10-11 20:54:39] Std.Dev: 0.03162277660168379; Max: 0.27
[INFO] [2019-10-11 20:54:39] Storing 180494 Traits
[INFO] [2019-10-11 20:54:39] Processing group of 180494 in 181 groups of 1000
[INFO] [2019-10-11 20:56:20] Average Time: 0.553
[INFO] [2019-10-11 20:56:20] Total Time: 1m41s
[INFO] [2019-10-11 20:56:20] last 3 / first 3: 0.54
[INFO] [2019-10-11 20:56:20] Std.Dev: 1.0774970997640783; Max: 5.82
[INFO] [2019-10-11 20:56:20] Storing 180234 MetaTraits
[INFO] [2019-10-11 20:56:20] Processing group of 180234 in 181 groups of 1000
[INFO] [2019-10-11 20:57:03] Average Time: 0.236
[INFO] [2019-10-11 20:57:03] Total Time: 44s
[INFO] [2019-10-11 20:57:03] last 3 / first 3: 0.5
[INFO] [2019-10-11 20:57:03] Std.Dev: 0.7949842765740717; Max: 7.06
[STOP] [2019-10-11 20:57:03] parse_diff_and_store
[START] [2019-10-11 20:57:03] resolve_keys
[INFO] [2019-10-11 21:00:35] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-11 21:00:46] traits to occurrences...
[INFO] [2019-10-11 21:00:58] traits to nodes (through occurrences)...
[INFO] [2019-10-11 21:01:00] Traits to sex term...
[INFO] [2019-10-11 21:01:08] Traits to lifestage term...
[INFO] [2019-10-11 21:01:17] MetaTraits to traits...
[INFO] [2019-10-11 21:01:28] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-11 21:01:52] Assocs to occurrences...
[INFO] [2019-10-11 21:01:52] Assocs to nodes...
[INFO] [2019-10-11 21:01:52] Assoc to sex term...
[INFO] [2019-10-11 21:01:52] Assoc to lifestage term...
[STOP] [2019-10-11 21:01:52] resolve_keys
[START] [2019-10-11 21:01:52] hold_for_later_1
[STOP] [2019-10-11 21:01:52] hold_for_later_1
[START] [2019-10-11 21:01:53] hold_for_later_2
[STOP] [2019-10-11 21:01:53] hold_for_later_2
[START] [2019-10-11 21:01:53] resolve_missing_parents
[STOP] [2019-10-11 21:04:46] resolve_missing_parents
[START] [2019-10-11 21:04:46] rebuild_nodes
[START] [2019-10-11 21:04:46] Flattener#flatten
[START] [2019-10-11 21:04:46] Flattener#study_resource
[START] [2019-10-11 21:04:47] Flattener#build_ancestry
[STOP] [2019-10-11 21:07:33] Flattener#build_ancestry
[INFO] [2019-10-11 21:07:33] 107705 ancestry keys
[START] [2019-10-11 21:07:33] build_node_ancestors
[INFO] [2019-10-11 21:07:33] old ancestors deleted.
[STOP] [2019-10-11 21:09:14] build_node_ancestors
[START] [2019-10-11 21:09:16] Flattener#propagate_ancestor_ids
[STOP] [2019-10-11 21:09:42] Flattener#propagate_ancestor_ids
[STOP] [2019-10-11 21:09:42] Flattener#flatten
[STOP] [2019-10-11 21:09:42] rebuild_nodes
[START] [2019-10-11 21:09:42] resolve_missing_media_owners
[STOP] [2019-10-11 21:09:42] resolve_missing_media_owners
[START] [2019-10-11 21:09:42] sanitize_media_verbatims
[STOP] [2019-10-11 21:09:42] sanitize_media_verbatims
[START] [2019-10-11 21:09:42] queue_downloads
[STOP] [2019-10-11 21:09:42] queue_downloads
[START] [2019-10-11 21:09:42] parse_names
[WARN] [2019-10-11 21:09:42] I see 107705 names which still need to be parsed.
[STOP] [2019-10-11 21:11:07] parse_names
[START] [2019-10-11 21:11:07] denormalize_canonical_names_to_nodes
[STOP] [2019-10-11 21:11:08] denormalize_canonical_names_to_nodes
[START] [2019-10-11 21:11:08] match_nodes
[START] [2019-10-11 21:11:08] map_all_nodes_to_pages
[STOP] [2019-10-11 22:33:05] map_all_nodes_to_pages
[INFO] [2019-10-11 22:33:05] 9176 Unmatched nodes (of 107705)! That's too many to output. First 10: Magnoliophyta (#48861872); Magnoliopsida (#48861871); Manihot carthaginensis (#48870260); Manihot paviaefolia (#48891500); Manihot purpureo-costata (#48891834); Manihot sagittato-partita (#48912908); Manihot tweediana (#48914230); Manihot longipetiolata (#48925247); Manihot speciosa (#48952292); Manihot riedeliana (#48956784)
[START] [2019-10-11 22:33:05] update_nodes
[STOP] [2019-10-11 22:33:07] update_nodes
[STOP] [2019-10-11 22:33:07] match_nodes
[START] [2019-10-11 22:33:07] reindex_search
[STOP] [2019-10-11 22:36:47] reindex_search
[START] [2019-10-11 22:36:47] normalize_units
[STOP] [2019-10-11 22:36:47] normalize_units
[START] [2019-10-11 22:36:47] calculate_statistics
[STOP] [2019-10-11 22:36:47] calculate_statistics
[START] [2019-10-11 22:36:47] complete_harvest_instance
[START] [2019-10-11 22:36:47] overall_tsv_creation
[INFO] [2019-10-11 22:36:48] Processing group of 107705 in 11 batches of 10000
[INFO] [2019-10-11 22:38:18] 6790 Traits (unfiltered)...
[INFO] [2019-10-11 22:38:32] 6790 Traits (filtered)...
[INFO] [2019-10-11 22:38:32] 0 Associations (filtered)...
[INFO] [2019-10-11 22:39:22] 33948 metadata added.
[INFO] [2019-10-11 22:39:22] 0 metadata added.
[INFO] [2019-10-11 22:40:58] 8176 Traits (unfiltered)...
[INFO] [2019-10-11 22:41:12] 8176 Traits (filtered)...
[INFO] [2019-10-11 22:41:12] 0 Associations (filtered)...
[INFO] [2019-10-11 22:42:10] 40875 metadata added.
[INFO] [2019-10-11 22:42:10] 0 metadata added.
[INFO] [2019-10-11 22:43:46] 8504 Traits (unfiltered)...
[INFO] [2019-10-11 22:44:00] 8504 Traits (filtered)...
[INFO] [2019-10-11 22:44:00] 0 Associations (filtered)...
[INFO] [2019-10-11 22:44:59] 42515 metadata added.
[INFO] [2019-10-11 22:44:59] 0 metadata added.
[INFO] [2019-10-11 22:46:38] 8470 Traits (unfiltered)...
[INFO] [2019-10-11 22:46:52] 8470 Traits (filtered)...
[INFO] [2019-10-11 22:46:52] 0 Associations (filtered)...
[INFO] [2019-10-11 22:47:49] 42339 metadata added.
[INFO] [2019-10-11 22:47:49] 0 metadata added.
[INFO] [2019-10-11 22:49:26] 8415 Traits (unfiltered)...
[INFO] [2019-10-11 22:49:40] 8415 Traits (filtered)...
[INFO] [2019-10-11 22:49:40] 0 Associations (filtered)...
[INFO] [2019-10-11 22:50:37] 42067 metadata added.
[INFO] [2019-10-11 22:50:37] 0 metadata added.
[INFO] [2019-10-11 22:52:13] 8450 Traits (unfiltered)...
[INFO] [2019-10-11 22:52:27] 8450 Traits (filtered)...
[INFO] [2019-10-11 22:52:27] 0 Associations (filtered)...
[INFO] [2019-10-11 22:53:24] 42227 metadata added.
[INFO] [2019-10-11 22:53:24] 0 metadata added.
[INFO] [2019-10-11 22:55:02] 8563 Traits (unfiltered)...
[INFO] [2019-10-11 22:55:16] 8563 Traits (filtered)...
[INFO] [2019-10-11 22:55:16] 0 Associations (filtered)...
[INFO] [2019-10-11 22:56:14] 42785 metadata added.
[INFO] [2019-10-11 22:56:14] 0 metadata added.
[INFO] [2019-10-11 22:57:52] 8421 Traits (unfiltered)...
[INFO] [2019-10-11 22:58:06] 8421 Traits (filtered)...
[INFO] [2019-10-11 22:58:06] 0 Associations (filtered)...
[INFO] [2019-10-11 22:59:03] 42075 metadata added.
[INFO] [2019-10-11 22:59:03] 0 metadata added.
[INFO] [2019-10-11 23:00:44] 8671 Traits (unfiltered)...
[INFO] [2019-10-11 23:00:58] 8671 Traits (filtered)...
[INFO] [2019-10-11 23:00:58] 0 Associations (filtered)...
[INFO] [2019-10-11 23:01:55] 43303 metadata added.
[INFO] [2019-10-11 23:01:55] 0 metadata added.
[INFO] [2019-10-11 23:03:33] 8893 Traits (unfiltered)...
[INFO] [2019-10-11 23:03:47] 8893 Traits (filtered)...
[INFO] [2019-10-11 23:03:47] 0 Associations (filtered)...
[INFO] [2019-10-11 23:04:45] 44417 metadata added.
[INFO] [2019-10-11 23:04:45] 0 metadata added.
[INFO] [2019-10-11 23:06:10] 6894 Traits (unfiltered)...
[INFO] [2019-10-11 23:06:24] 6894 Traits (filtered)...
[INFO] [2019-10-11 23:06:25] 0 Associations (filtered)...
[INFO] [2019-10-11 23:07:19] 34424 metadata added.
[INFO] [2019-10-11 23:07:19] 0 metadata added.
[INFO] [2019-10-11 23:07:19] Average Time: 138.344
[INFO] [2019-10-11 23:07:19] Total Time: 30m32s
[INFO] [2019-10-11 23:07:19] last 3 / first 3: 1.01
[INFO] [2019-10-11 23:07:19] Std.Dev: 5.284316417475396; Max: 142.54
[STOP] [2019-10-11 23:07:19] overall_tsv_creation
[INFO] [2019-10-11 23:07:19] Done. Check your files:
[INFO] [2019-10-11 23:07:19] (107705 lines) /app/public/data/brazil_sp_list/publish_nodes.tsv
[INFO] [2019-10-11 23:07:19] (619534 lines) /app/public/data/brazil_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-11 23:07:19] (107705 lines) /app/public/data/brazil_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-11 23:07:19] (90248 lines) /app/public/data/brazil_sp_list/publish_traits.tsv
[INFO] [2019-10-11 23:07:20] (450976 lines) /app/public/data/brazil_sp_list/publish_metadata.tsv
[STOP] [2019-10-11 23:07:20] complete_harvest_instance
[START] [2019-10-11 23:07:20] completed
[STOP] [2019-10-11 23:07:20] completed
[STOP] [2019-10-11 23:07:20] logged process, took 8750.04
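
The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which makes it easier to see where the time went (for instance, map_all_nodes_to_pages versus the run as a whole). The following is a minimal sketch, not part of the harvester itself: the function name stage_durations and the regular expression are assumptions made for illustration, and the sketch assumes each stage name appears in at most one START/STOP pair, as it does in this log.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-11 20:41:30] fetch_files".
# The pattern is an assumption based on the line layout seen in this log.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def stage_durations(lines):
    """Return {stage_name: seconds} for every START/STOP pair found.

    Hypothetical helper: lines of other kinds ([INFO], [CMD], [WARN])
    are skipped, and a STOP without a matching START is ignored.
    """
    open_stages = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_stages[name] = when
        else:
            # The final STOP line carries a ", took N" suffix; drop it
            # so the name matches the one used on the START line.
            name = name.split(",")[0]
            if name in open_stages:
                elapsed = when - open_stages.pop(name)
                durations[name] = elapsed.total_seconds()
    return durations

# A few lines copied from the log above.
sample = [
    "[START] [2019-10-11 20:41:30] logged process",
    "[START] [2019-10-11 21:11:08] map_all_nodes_to_pages",
    "[STOP] [2019-10-11 22:33:05] map_all_nodes_to_pages",
    "[STOP] [2019-10-11 23:07:20] logged process, took 8750.04",
]
result = stage_durations(sample)
for stage, seconds in result.items():
    print(f"{stage}: {seconds:.0f} s")
```

On these four lines the sketch reports 4917 s for map_all_nodes_to_pages and 8750 s for the logged process, the latter agreeing with the "took 8750.04" figure on the final log line. Timestamps in the log have one-second resolution, so short stages will show as 0 or 1 s.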