Harvest for South Pacific Species List
Created: 28 Dec 11:24

Stage: completed
Fetched: 28 Dec 11:24
Validated: 28 Dec 11:24
Deltas Created: 28 Dec 11:24
Units Normalized: 28 Dec 15:06
Ancestry Built: 28 Dec 11:40
Nodes Matched: 28 Dec 15:02
Names Parsed: 28 Dec 11:41
New Models Stored: 28 Dec 11:33
Indexed: 28 Dec 15:06
Completed: 28 Dec 15:29
Time to Harvest: 4 minutes

Harvesting Log

(195 lines)
# Logfile created on 2019-12-28 11:24:05 -0500 by logger.rb/56815
[START] [2019-12-28 11:24:05] logged process
[START] [2019-12-28 11:24:05] create_harvest_instance
[STOP] [2019-12-28 11:24:05] create_harvest_instance
[START] [2019-12-28 11:24:05] fetch_files
[STOP] [2019-12-28 11:24:05] fetch_files
[START] [2019-12-28 11:24:05] validate_each_file
[STOP] [2019-12-28 11:24:15] validate_each_file
[START] [2019-12-28 11:24:15] convert_to_csv
[CMD] [2019-12-28 11:24:15] /usr/bin/sort /app/public/converted_csv/s_pacific_sp_lis_refs_19785.csv > /app/public/converted_csv/s_pacific_sp_lis_refs_19785.csv_sorted
[CMD] [2019-12-28 11:24:16] /usr/bin/sort /app/public/converted_csv/s_pacific_sp_lis_nodes_19786.csv > /app/public/converted_csv/s_pacific_sp_lis_nodes_19786.csv_sorted
[CMD] [2019-12-28 11:24:16] /usr/bin/sort /app/public/converted_csv/s_pacific_sp_lis_occurrences_19787.csv > /app/public/converted_csv/s_pacific_sp_lis_occurrences_19787.csv_sorted
[CMD] [2019-12-28 11:24:17] /usr/bin/sort /app/public/converted_csv/s_pacific_sp_lis_measurements_19788.csv > /app/public/converted_csv/s_pacific_sp_lis_measurements_19788.csv_sorted
[STOP] [2019-12-28 11:24:18] convert_to_csv
[START] [2019-12-28 11:24:18] calculate_delta
[CMD] [2019-12-28 11:24:18] echo "0a" > /app/public/diff/s_pacific_sp_lis_refs_19785.diff
[CMD] [2019-12-28 11:24:18] tail -n +1 /app/public/converted_csv/s_pacific_sp_lis_refs_19785.csv >> /app/public/diff/s_pacific_sp_lis_refs_19785.diff
[CMD] [2019-12-28 11:24:19] echo "." >> /app/public/diff/s_pacific_sp_lis_refs_19785.diff
[CMD] [2019-12-28 11:24:20] echo "0a" > /app/public/diff/s_pacific_sp_lis_nodes_19786.diff
[CMD] [2019-12-28 11:24:20] tail -n +1 /app/public/converted_csv/s_pacific_sp_lis_nodes_19786.csv >> /app/public/diff/s_pacific_sp_lis_nodes_19786.diff
[CMD] [2019-12-28 11:24:21] echo "." >> /app/public/diff/s_pacific_sp_lis_nodes_19786.diff
[CMD] [2019-12-28 11:24:21] echo "0a" > /app/public/diff/s_pacific_sp_lis_occurrences_19787.diff
[CMD] [2019-12-28 11:24:22] tail -n +1 /app/public/converted_csv/s_pacific_sp_lis_occurrences_19787.csv >> /app/public/diff/s_pacific_sp_lis_occurrences_19787.diff
[CMD] [2019-12-28 11:24:23] echo "." >> /app/public/diff/s_pacific_sp_lis_occurrences_19787.diff
[CMD] [2019-12-28 11:24:23] echo "0a" > /app/public/diff/s_pacific_sp_lis_measurements_19788.diff
[CMD] [2019-12-28 11:24:24] tail -n +1 /app/public/converted_csv/s_pacific_sp_lis_measurements_19788.csv >> /app/public/diff/s_pacific_sp_lis_measurements_19788.diff
[CMD] [2019-12-28 11:24:25] echo "." >> /app/public/diff/s_pacific_sp_lis_measurements_19788.diff
[STOP] [2019-12-28 11:24:25] calculate_delta
[START] [2019-12-28 11:24:25] parse_diff_and_store
[INFO] [2019-12-28 11:24:26] Loading refs diff file into memory (true lines)...
[INFO] [2019-12-28 11:24:27] Loading nodes diff file into memory (true lines)...
[WARN] [2019-12-28 11:24:46] Filtered Scientific Name `Hoplodactylus ""mokohinaus"""` to `Hoplodactylus mokohinaus`
[WARN] [2019-12-28 11:24:49] Filtered Scientific Name `Hypoplectrodes ""species` to `Hypoplectrodes species`
[INFO] [2019-12-28 11:24:56] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-12-28 11:25:04] Loading measurements diff file into memory (true lines)...
[INFO] [2019-12-28 11:30:41] Storing 2 References
[INFO] [2019-12-28 11:30:41] Processing group of 2 in 1 groups of 1000
[INFO] [2019-12-28 11:30:41] Average Time: 0.0
[INFO] [2019-12-28 11:30:41] Total Time: 1s
[INFO] [2019-12-28 11:30:41] Storing 80834 ScientificNames
[INFO] [2019-12-28 11:30:41] Processing group of 80834 in 81 groups of 1000
[INFO] [2019-12-28 11:31:19] Average Time: 0.464
[INFO] [2019-12-28 11:31:19] Total Time: 38s
[INFO] [2019-12-28 11:31:19] last 3 / first 3: 0.92
[INFO] [2019-12-28 11:31:19] Std.Dev: 0.37416573867739417; Max: 3.03
[INFO] [2019-12-28 11:31:19] Storing 80834 Nodes
[INFO] [2019-12-28 11:31:19] Processing group of 80834 in 81 groups of 1000
[INFO] [2019-12-28 11:31:50] Average Time: 0.383
[INFO] [2019-12-28 11:31:50] Total Time: 32s
[INFO] [2019-12-28 11:31:50] last 3 / first 3: 1.03
[INFO] [2019-12-28 11:31:50] Std.Dev: 0.4207136793592526; Max: 3.28
[INFO] [2019-12-28 11:31:50] Storing 58156 Occurrences
[INFO] [2019-12-28 11:31:50] Processing group of 58156 in 59 groups of 1000
[INFO] [2019-12-28 11:32:04] Average Time: 0.226
[INFO] [2019-12-28 11:32:04] Total Time: 14s
[INFO] [2019-12-28 11:32:04] last 3 / first 3: 0.94
[INFO] [2019-12-28 11:32:04] Std.Dev: 0.5300943312279429; Max: 3.36
[INFO] [2019-12-28 11:32:04] Storing 116312 TraitsReferences
[INFO] [2019-12-28 11:32:04] Processing group of 116312 in 117 groups of 1000
[INFO] [2019-12-28 11:32:12] Average Time: 0.066
[INFO] [2019-12-28 11:32:12] Total Time: 9s
[INFO] [2019-12-28 11:32:12] last 3 / first 3: 0.52
[INFO] [2019-12-28 11:32:12] Std.Dev: 0.03162277660168379; Max: 0.24
[INFO] [2019-12-28 11:32:12] Storing 116312 Traits
[INFO] [2019-12-28 11:32:12] Processing group of 116312 in 117 groups of 1000
[INFO] [2019-12-28 11:32:59] Average Time: 0.392
[INFO] [2019-12-28 11:32:59] Total Time: 47s
[INFO] [2019-12-28 11:32:59] last 3 / first 3: 0.84
[INFO] [2019-12-28 11:32:59] Std.Dev: 0.45607017003965516; Max: 3.78
[INFO] [2019-12-28 11:32:59] Storing 116160 MetaTraits
[INFO] [2019-12-28 11:32:59] Processing group of 116160 in 117 groups of 1000
[INFO] [2019-12-28 11:33:20] Average Time: 0.181
[INFO] [2019-12-28 11:33:20] Total Time: 22s
[INFO] [2019-12-28 11:33:20] last 3 / first 3: 0.62
[INFO] [2019-12-28 11:33:20] Std.Dev: 0.5079370039680118; Max: 4.04
[STOP] [2019-12-28 11:33:20] parse_diff_and_store
[START] [2019-12-28 11:33:20] resolve_keys
[INFO] [2019-12-28 11:36:00] Occurrences to nodes (through scientific_names)...
[INFO] [2019-12-28 11:36:10] traits to occurrences...
[INFO] [2019-12-28 11:36:19] traits to nodes (through occurrences)...
[INFO] [2019-12-28 11:36:20] Traits to sex term...
[INFO] [2019-12-28 11:36:27] Traits to lifestage term...
[INFO] [2019-12-28 11:36:35] MetaTraits to traits...
[INFO] [2019-12-28 11:36:42] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-12-28 11:36:58] Assocs to occurrences...
[INFO] [2019-12-28 11:36:58] Assocs to nodes...
[INFO] [2019-12-28 11:36:58] Assoc to sex term...
[INFO] [2019-12-28 11:36:58] Assoc to lifestage term...
[STOP] [2019-12-28 11:36:58] resolve_keys
[START] [2019-12-28 11:36:58] hold_for_later_1
[STOP] [2019-12-28 11:36:58] hold_for_later_1
[START] [2019-12-28 11:36:58] hold_for_later_2
[STOP] [2019-12-28 11:36:58] hold_for_later_2
[START] [2019-12-28 11:36:58] resolve_missing_parents
[STOP] [2019-12-28 11:38:39] resolve_missing_parents
[START] [2019-12-28 11:38:39] rebuild_nodes
[START] [2019-12-28 11:38:39] Flattener#flatten
[START] [2019-12-28 11:38:39] Flattener#study_resource
[START] [2019-12-28 11:38:40] Flattener#build_ancestry
[STOP] [2019-12-28 11:38:57] Flattener#build_ancestry
[INFO] [2019-12-28 11:38:57] 80834 ancestry keys
[START] [2019-12-28 11:38:57] build_node_ancestors
[INFO] [2019-12-28 11:38:57] old ancestors deleted.
[STOP] [2019-12-28 11:40:00] build_node_ancestors
[START] [2019-12-28 11:40:03] Flattener#propagate_ancestor_ids
[STOP] [2019-12-28 11:40:16] Flattener#propagate_ancestor_ids
[STOP] [2019-12-28 11:40:16] Flattener#flatten
[STOP] [2019-12-28 11:40:16] rebuild_nodes
[START] [2019-12-28 11:40:16] resolve_missing_media_owners
[STOP] [2019-12-28 11:40:16] resolve_missing_media_owners
[START] [2019-12-28 11:40:16] sanitize_media_verbatims
[STOP] [2019-12-28 11:40:16] sanitize_media_verbatims
[START] [2019-12-28 11:40:16] queue_downloads
[STOP] [2019-12-28 11:40:16] queue_downloads
[START] [2019-12-28 11:40:16] parse_names
[WARN] [2019-12-28 11:40:16] I see 80834 names which still need to be parsed.
[STOP] [2019-12-28 11:41:18] parse_names
[START] [2019-12-28 11:41:18] denormalize_canonical_names_to_nodes
[STOP] [2019-12-28 11:41:19] denormalize_canonical_names_to_nodes
[START] [2019-12-28 11:41:19] match_nodes
[START] [2019-12-28 11:41:20] map_all_nodes_to_pages
[STOP] [2019-12-28 15:02:09] map_all_nodes_to_pages
[INFO] [2019-12-28 15:02:09] 6133 Unmatched nodes (of 80834)! That's too many to output. First 10: Onychoprion fuscata (#62659603); Chroicocephalus scopulinus (#62659574); Chlidonias nigra (#62715970); Thalaseus bergii (#62660557); Thalaseus maximus (#62661110); Thalaseus sandvicensis (#62661716); Thalaseus bengalensis (#62705598); Thalaseus maxima (#62738266); Thalasseus maxima (#62729446); Rynchops nigra (#62717115)
[START] [2019-12-28 15:02:09] update_nodes
[STOP] [2019-12-28 15:02:39] update_nodes
[STOP] [2019-12-28 15:02:39] match_nodes
[START] [2019-12-28 15:02:39] reindex_search
[STOP] [2019-12-28 15:06:00] reindex_search
[START] [2019-12-28 15:06:00] normalize_units
[STOP] [2019-12-28 15:06:32] normalize_units
[START] [2019-12-28 15:06:32] calculate_statistics
[STOP] [2019-12-28 15:06:33] calculate_statistics
[START] [2019-12-28 15:06:33] complete_harvest_instance
[START] [2019-12-28 15:06:33] overall_tsv_creation
[INFO] [2019-12-28 15:06:33] Processing group of 80834 in 9 batches of 10000
[INFO] [2019-12-28 15:08:00] 5633 Traits (unfiltered)...
[INFO] [2019-12-28 15:08:14] 5633 Traits (filtered)...
[INFO] [2019-12-28 15:08:14] 0 Associations (filtered)...
[INFO] [2019-12-28 15:09:03] 28155 metadata added.
[INFO] [2019-12-28 15:09:03] 0 metadata added.
[INFO] [2019-12-28 15:10:37] 6486 Traits (unfiltered)...
[INFO] [2019-12-28 15:10:50] 6486 Traits (filtered)...
[INFO] [2019-12-28 15:10:50] 0 Associations (filtered)...
[INFO] [2019-12-28 15:11:43] 32420 metadata added.
[INFO] [2019-12-28 15:11:43] 0 metadata added.
[INFO] [2019-12-28 15:13:16] 6981 Traits (unfiltered)...
[INFO] [2019-12-28 15:13:30] 6981 Traits (filtered)...
[INFO] [2019-12-28 15:13:30] 0 Associations (filtered)...
[INFO] [2019-12-28 15:14:22] 34897 metadata added.
[INFO] [2019-12-28 15:14:22] 0 metadata added.
[INFO] [2019-12-28 15:15:56] 7339 Traits (unfiltered)...
[INFO] [2019-12-28 15:16:10] 7339 Traits (filtered)...
[INFO] [2019-12-28 15:16:10] 0 Associations (filtered)...
[INFO] [2019-12-28 15:17:01] 36675 metadata added.
[INFO] [2019-12-28 15:17:01] 0 metadata added.
[INFO] [2019-12-28 15:18:40] 7527 Traits (unfiltered)...
[INFO] [2019-12-28 15:18:53] 7527 Traits (filtered)...
[INFO] [2019-12-28 15:18:53] 0 Associations (filtered)...
[INFO] [2019-12-28 15:19:48] 37617 metadata added.
[INFO] [2019-12-28 15:19:48] 0 metadata added.
[INFO] [2019-12-28 15:21:23] 7556 Traits (unfiltered)...
[INFO] [2019-12-28 15:21:36] 7556 Traits (filtered)...
[INFO] [2019-12-28 15:21:36] 0 Associations (filtered)...
[INFO] [2019-12-28 15:22:31] 37757 metadata added.
[INFO] [2019-12-28 15:22:31] 0 metadata added.
[INFO] [2019-12-28 15:24:06] 7807 Traits (unfiltered)...
[INFO] [2019-12-28 15:24:19] 7807 Traits (filtered)...
[INFO] [2019-12-28 15:24:19] 0 Associations (filtered)...
[INFO] [2019-12-28 15:25:13] 39014 metadata added.
[INFO] [2019-12-28 15:25:13] 0 metadata added.
[INFO] [2019-12-28 15:26:48] 8145 Traits (unfiltered)...
[INFO] [2019-12-28 15:27:01] 8145 Traits (filtered)...
[INFO] [2019-12-28 15:27:01] 0 Associations (filtered)...
[INFO] [2019-12-28 15:27:59] 40687 metadata added.
[INFO] [2019-12-28 15:27:59] 0 metadata added.
[INFO] [2019-12-28 15:28:47] 682 Traits (unfiltered)...
[INFO] [2019-12-28 15:29:00] 682 Traits (filtered)...
[INFO] [2019-12-28 15:29:00] 0 Associations (filtered)...
[INFO] [2019-12-28 15:29:38] 3406 metadata added.
[INFO] [2019-12-28 15:29:38] 0 metadata added.
[INFO] [2019-12-28 15:29:38] Average Time: 124.927
[INFO] [2019-12-28 15:29:38] Total Time: 23m5s
[INFO] [2019-12-28 15:29:38] last 3 / first 3: 0.89
[INFO] [2019-12-28 15:29:38] Std.Dev: 18.698449133551158; Max: 134.59
[STOP] [2019-12-28 15:29:38] overall_tsv_creation
[INFO] [2019-12-28 15:29:38] Done. Check your files:
[INFO] [2019-12-28 15:29:38] (80834 lines) /app/public/data/s_pacific_sp_lis/publish_nodes.tsv
[INFO] [2019-12-28 15:29:39] (448763 lines) /app/public/data/s_pacific_sp_lis/publish_node_ancestors.tsv
[INFO] [2019-12-28 15:29:39] (80834 lines) /app/public/data/s_pacific_sp_lis/publish_scientific_names.tsv
[INFO] [2019-12-28 15:29:40] (58157 lines) /app/public/data/s_pacific_sp_lis/publish_traits.tsv
[INFO] [2019-12-28 15:29:41] (290629 lines) /app/public/data/s_pacific_sp_lis/publish_metadata.tsv
[STOP] [2019-12-28 15:29:41] complete_harvest_instance
[START] [2019-12-28 15:29:41] completed
[STOP] [2019-12-28 15:29:41] completed
[STOP] [2019-12-28 15:29:41] logged process, took 14736.34

Latest Process