Harvest for Gulf of Mexico Species List Created 23 Dec 08:48

Stage: completed
Fetched: 23 Dec 08:48
Validated: 23 Dec 08:48
Deltas Created: 23 Dec 08:48
Units Normalized: 23 Dec 10:41
Ancestry Built: 23 Dec 08:53
Nodes Matched: 23 Dec 10:40
Names Parsed: 23 Dec 08:54
New Models Stored: 23 Dec 08:51
Indexed: 23 Dec 10:41
Completed: 23 Dec 10:48
Time to Harvest: 2 hours

Harvesting Log

(161 lines)
# Logfile created on 2019-12-23 08:48:52 -0500 by logger.rb/56815
[START] [2019-12-23 08:48:52] logged process
[START] [2019-12-23 08:48:52] create_harvest_instance
[STOP] [2019-12-23 08:48:53] create_harvest_instance
[START] [2019-12-23 08:48:53] fetch_files
[STOP] [2019-12-23 08:48:53] fetch_files
[START] [2019-12-23 08:48:53] validate_each_file
[STOP] [2019-12-23 08:48:56] validate_each_file
[START] [2019-12-23 08:48:56] convert_to_csv
[CMD] [2019-12-23 08:48:56] /usr/bin/sort /app/public/converted_csv/gulf_mexico_sp_l_refs_19322.csv > /app/public/converted_csv/gulf_mexico_sp_l_refs_19322.csv_sorted
[CMD] [2019-12-23 08:48:56] /usr/bin/sort /app/public/converted_csv/gulf_mexico_sp_l_nodes_19323.csv > /app/public/converted_csv/gulf_mexico_sp_l_nodes_19323.csv_sorted
[CMD] [2019-12-23 08:48:56] /usr/bin/sort /app/public/converted_csv/gulf_mexico_sp_l_occurrences_19324.csv > /app/public/converted_csv/gulf_mexico_sp_l_occurrences_19324.csv_sorted
[CMD] [2019-12-23 08:48:56] /usr/bin/sort /app/public/converted_csv/gulf_mexico_sp_l_measurements_19325.csv > /app/public/converted_csv/gulf_mexico_sp_l_measurements_19325.csv_sorted
[STOP] [2019-12-23 08:48:56] convert_to_csv
[START] [2019-12-23 08:48:56] calculate_delta
[CMD] [2019-12-23 08:48:56] echo "0a" > /app/public/diff/gulf_mexico_sp_l_refs_19322.diff
[CMD] [2019-12-23 08:48:57] tail -n +1 /app/public/converted_csv/gulf_mexico_sp_l_refs_19322.csv >> /app/public/diff/gulf_mexico_sp_l_refs_19322.diff
[CMD] [2019-12-23 08:48:57] echo "." >> /app/public/diff/gulf_mexico_sp_l_refs_19322.diff
[CMD] [2019-12-23 08:48:57] echo "0a" > /app/public/diff/gulf_mexico_sp_l_nodes_19323.diff
[CMD] [2019-12-23 08:48:57] tail -n +1 /app/public/converted_csv/gulf_mexico_sp_l_nodes_19323.csv >> /app/public/diff/gulf_mexico_sp_l_nodes_19323.diff
[CMD] [2019-12-23 08:48:57] echo "." >> /app/public/diff/gulf_mexico_sp_l_nodes_19323.diff
[CMD] [2019-12-23 08:48:57] echo "0a" > /app/public/diff/gulf_mexico_sp_l_occurrences_19324.diff
[CMD] [2019-12-23 08:48:57] tail -n +1 /app/public/converted_csv/gulf_mexico_sp_l_occurrences_19324.csv >> /app/public/diff/gulf_mexico_sp_l_occurrences_19324.diff
[CMD] [2019-12-23 08:48:57] echo "." >> /app/public/diff/gulf_mexico_sp_l_occurrences_19324.diff
[CMD] [2019-12-23 08:48:57] echo "0a" > /app/public/diff/gulf_mexico_sp_l_measurements_19325.diff
[CMD] [2019-12-23 08:48:57] tail -n +1 /app/public/converted_csv/gulf_mexico_sp_l_measurements_19325.csv >> /app/public/diff/gulf_mexico_sp_l_measurements_19325.diff
[CMD] [2019-12-23 08:48:57] echo "." >> /app/public/diff/gulf_mexico_sp_l_measurements_19325.diff
[STOP] [2019-12-23 08:48:57] calculate_delta
[START] [2019-12-23 08:48:57] parse_diff_and_store
[INFO] [2019-12-23 08:48:57] Loading refs diff file into memory (true lines)...
[INFO] [2019-12-23 08:48:57] Loading nodes diff file into memory (true lines)...
[INFO] [2019-12-23 08:49:06] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-12-23 08:49:08] Loading measurements diff file into memory (true lines)...
[INFO] [2019-12-23 08:50:31] Storing 2 References
[INFO] [2019-12-23 08:50:31] Processing group of 2 in 1 groups of 1000
[INFO] [2019-12-23 08:50:31] Average Time: 0.0
[INFO] [2019-12-23 08:50:31] Total Time: 1s
[INFO] [2019-12-23 08:50:31] Storing 22622 ScientificNames
[INFO] [2019-12-23 08:50:31] Processing group of 22622 in 23 groups of 1000
[INFO] [2019-12-23 08:50:41] Average Time: 0.428
[INFO] [2019-12-23 08:50:41] Total Time: 10s
[INFO] [2019-12-23 08:50:41] last 3 / first 3: 0.92
[INFO] [2019-12-23 08:50:41] Std.Dev: 0.19235384061671346; Max: 1.28
[INFO] [2019-12-23 08:50:41] Storing 22622 Nodes
[INFO] [2019-12-23 08:50:41] Processing group of 22622 in 23 groups of 1000
[INFO] [2019-12-23 08:50:48] Average Time: 0.315
[INFO] [2019-12-23 08:50:48] Total Time: 8s
[INFO] [2019-12-23 08:50:48] last 3 / first 3: 0.8
[INFO] [2019-12-23 08:50:48] Std.Dev: 0.044721359549995794; Max: 0.4
[INFO] [2019-12-23 08:50:48] Storing 13939 Occurrences
[INFO] [2019-12-23 08:50:48] Processing group of 13939 in 14 groups of 1000
[INFO] [2019-12-23 08:50:50] Average Time: 0.126
[INFO] [2019-12-23 08:50:50] Total Time: 2s
[INFO] [2019-12-23 08:50:50] last 3 / first 3: 0.84
[INFO] [2019-12-23 08:50:50] Std.Dev: 0.0; Max: 0.2
[INFO] [2019-12-23 08:50:50] Storing 27878 TraitsReferences
[INFO] [2019-12-23 08:50:50] Processing group of 27878 in 28 groups of 1000
[INFO] [2019-12-23 08:50:54] Average Time: 0.12
[INFO] [2019-12-23 08:50:54] Total Time: 4s
[INFO] [2019-12-23 08:50:54] last 3 / first 3: 3.13
[INFO] [2019-12-23 08:50:54] Std.Dev: 0.12649110640673517; Max: 0.76
[INFO] [2019-12-23 08:50:54] Storing 27878 Traits
[INFO] [2019-12-23 08:50:54] Processing group of 27878 in 28 groups of 1000
[INFO] [2019-12-23 08:51:05] Average Time: 0.382
[INFO] [2019-12-23 08:51:05] Total Time: 11s
[INFO] [2019-12-23 08:51:05] last 3 / first 3: 0.46
[INFO] [2019-12-23 08:51:05] Std.Dev: 0.20976176963403032; Max: 1.44
[INFO] [2019-12-23 08:51:05] Storing 27865 MetaTraits
[INFO] [2019-12-23 08:51:05] Processing group of 27865 in 28 groups of 1000
[INFO] [2019-12-23 08:51:10] Average Time: 0.19
[INFO] [2019-12-23 08:51:10] Total Time: 6s
[INFO] [2019-12-23 08:51:10] last 3 / first 3: 3.3
[INFO] [2019-12-23 08:51:10] Std.Dev: 0.2024845673131659; Max: 1.22
[STOP] [2019-12-23 08:51:10] parse_diff_and_store
[START] [2019-12-23 08:51:10] resolve_keys
[INFO] [2019-12-23 08:52:18] Occurrences to nodes (through scientific_names)...
[INFO] [2019-12-23 08:52:24] traits to occurrences...
[INFO] [2019-12-23 08:52:29] traits to nodes (through occurrences)...
[INFO] [2019-12-23 08:52:29] Traits to sex term...
[INFO] [2019-12-23 08:52:33] Traits to lifestage term...
[INFO] [2019-12-23 08:52:38] MetaTraits to traits...
[INFO] [2019-12-23 08:52:40] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-12-23 08:52:44] Assocs to occurrences...
[INFO] [2019-12-23 08:52:44] Assocs to nodes...
[INFO] [2019-12-23 08:52:44] Assoc to sex term...
[INFO] [2019-12-23 08:52:44] Assoc to lifestage term...
[STOP] [2019-12-23 08:52:44] resolve_keys
[START] [2019-12-23 08:52:44] hold_for_later_1
[STOP] [2019-12-23 08:52:44] hold_for_later_1
[START] [2019-12-23 08:52:44] hold_for_later_2
[STOP] [2019-12-23 08:52:44] hold_for_later_2
[START] [2019-12-23 08:52:44] resolve_missing_parents
[STOP] [2019-12-23 08:53:27] resolve_missing_parents
[START] [2019-12-23 08:53:27] rebuild_nodes
[START] [2019-12-23 08:53:27] Flattener#flatten
[START] [2019-12-23 08:53:27] Flattener#study_resource
[START] [2019-12-23 08:53:27] Flattener#build_ancestry
[STOP] [2019-12-23 08:53:31] Flattener#build_ancestry
[INFO] [2019-12-23 08:53:31] 22622 ancestry keys
[START] [2019-12-23 08:53:31] build_node_ancestors
[INFO] [2019-12-23 08:53:31] old ancestors deleted.
[STOP] [2019-12-23 08:53:48] build_node_ancestors
[START] [2019-12-23 08:53:50] Flattener#propagate_ancestor_ids
[STOP] [2019-12-23 08:53:53] Flattener#propagate_ancestor_ids
[STOP] [2019-12-23 08:53:53] Flattener#flatten
[STOP] [2019-12-23 08:53:53] rebuild_nodes
[START] [2019-12-23 08:53:53] resolve_missing_media_owners
[STOP] [2019-12-23 08:53:53] resolve_missing_media_owners
[START] [2019-12-23 08:53:53] sanitize_media_verbatims
[STOP] [2019-12-23 08:53:53] sanitize_media_verbatims
[START] [2019-12-23 08:53:53] queue_downloads
[STOP] [2019-12-23 08:53:53] queue_downloads
[START] [2019-12-23 08:53:53] parse_names
[WARN] [2019-12-23 08:53:53] I see 22622 names which still need to be parsed.
[STOP] [2019-12-23 08:54:12] parse_names
[START] [2019-12-23 08:54:12] denormalize_canonical_names_to_nodes
[STOP] [2019-12-23 08:54:12] denormalize_canonical_names_to_nodes
[START] [2019-12-23 08:54:12] match_nodes
[START] [2019-12-23 08:54:12] map_all_nodes_to_pages
[STOP] [2019-12-23 10:39:43] map_all_nodes_to_pages
[INFO] [2019-12-23 10:39:43] 1054 Unmatched nodes (of 22622)! That's too many to output. First 10: Phalacrocorax olivaceus (#61947384); Larus thayeri (#61935758); Larus audouinii (#61953919); Thalasseus maxima (#61934734); Limnodromus (#61932408); Tringa semipalmatus (#61934812); Philomachus (#61942676); Philomachus pugnax (#61942675); Anas discors (#61932397); Anas clypeata (#61932462)
[START] [2019-12-23 10:39:43] update_nodes
[STOP] [2019-12-23 10:40:03] update_nodes
[STOP] [2019-12-23 10:40:03] match_nodes
[START] [2019-12-23 10:40:03] reindex_search
[STOP] [2019-12-23 10:41:06] reindex_search
[START] [2019-12-23 10:41:06] normalize_units
[STOP] [2019-12-23 10:41:06] normalize_units
[START] [2019-12-23 10:41:06] calculate_statistics
[STOP] [2019-12-23 10:41:06] calculate_statistics
[START] [2019-12-23 10:41:06] complete_harvest_instance
[START] [2019-12-23 10:41:06] overall_tsv_creation
[INFO] [2019-12-23 10:41:07] Processing group of 22622 in 3 batches of 10000
[INFO] [2019-12-23 10:42:44] 5472 Traits (unfiltered)...
[INFO] [2019-12-23 10:42:57] 5472 Traits (filtered)...
[INFO] [2019-12-23 10:42:57] 0 Associations (filtered)...
[INFO] [2019-12-23 10:43:48] 27357 metadata added.
[INFO] [2019-12-23 10:43:48] 0 metadata added.
[INFO] [2019-12-23 10:45:27] 6606 Traits (unfiltered)...
[INFO] [2019-12-23 10:45:40] 6606 Traits (filtered)...
[INFO] [2019-12-23 10:45:40] 0 Associations (filtered)...
[INFO] [2019-12-23 10:46:35] 33022 metadata added.
[INFO] [2019-12-23 10:46:35] 0 metadata added.
[INFO] [2019-12-23 10:47:36] 1861 Traits (unfiltered)...
[INFO] [2019-12-23 10:47:49] 1861 Traits (filtered)...
[INFO] [2019-12-23 10:47:49] 0 Associations (filtered)...
[INFO] [2019-12-23 10:48:30] 9303 metadata added.
[INFO] [2019-12-23 10:48:30] 0 metadata added.
[INFO] [2019-12-23 10:48:30] Average Time: 118.893
[INFO] [2019-12-23 10:48:30] Total Time: 7m24s
[STOP] [2019-12-23 10:48:30] overall_tsv_creation
[INFO] [2019-12-23 10:48:30] Done. Check your files:
[INFO] [2019-12-23 10:48:30] (22622 lines) /app/public/data/gulf_mexico_sp_l/publish_nodes.tsv
[INFO] [2019-12-23 10:48:30] (120314 lines) /app/public/data/gulf_mexico_sp_l/publish_node_ancestors.tsv
[INFO] [2019-12-23 10:48:30] (22622 lines) /app/public/data/gulf_mexico_sp_l/publish_scientific_names.tsv
[INFO] [2019-12-23 10:48:30] (13940 lines) /app/public/data/gulf_mexico_sp_l/publish_traits.tsv
[INFO] [2019-12-23 10:48:30] (69683 lines) /app/public/data/gulf_mexico_sp_l/publish_metadata.tsv
[STOP] [2019-12-23 10:48:31] complete_harvest_instance
[START] [2019-12-23 10:48:31] completed
[STOP] [2019-12-23 10:48:31] completed
[STOP] [2019-12-23 10:48:31] logged process, took 7178.1
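
The [START] and [STOP] pairs in the log above can be turned into per-stage durations to see where the harvest spent its time. What follows is a minimal sketch, not part of the harvester itself: the function name stage_durations and the regular expression are assumptions made for illustration, and the sample lines are copied from the log.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-12-23 08:48:53] fetch_files".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def stage_durations(lines):
    """Return {stage_name: whole seconds} for every START/STOP pair found.

    Hypothetical helper: pairs are matched by stage name, so nested
    stages (for example map_all_nodes_to_pages inside match_nodes) work.
    """
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # INFO, CMD, WARN and header lines are skipped
        kind, stamp, name = m.groups()
        # The final STOP line carries a ", took 7178.1" suffix; drop it.
        name = re.sub(r", took [\d.]+$", "", name)
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            delta = when - started.pop(name)
            durations[name] = int(delta.total_seconds())
    return durations

sample = [
    "[START] [2019-12-23 08:48:52] logged process",
    "[START] [2019-12-23 08:54:12] match_nodes",
    "[START] [2019-12-23 08:54:12] map_all_nodes_to_pages",
    "[STOP] [2019-12-23 10:39:43] map_all_nodes_to_pages",
    "[STOP] [2019-12-23 10:40:03] match_nodes",
    "[STOP] [2019-12-23 10:48:31] logged process, took 7178.1",
]
result = stage_durations(sample)
for stage, seconds in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{stage}: {seconds}s")
```

On these sample lines, map_all_nodes_to_pages accounts for 6331 of the 7179 seconds between the first and last timestamps, so node matching dominates the run.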

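The calculate_delta commands in the log write "0a", then the full CSV, then a line holding only "." into each .diff file. That is the shape of an ed-style script that appends every row after line 0, which is what a diff against an empty previous harvest would look like. The sketch below reproduces that layout under stated assumptions: the function names and the CSV columns are hypothetical, and only the "0a" and "." framing comes from the log.

```python
def initial_ed_script(lines):
    """Build an ed-style script that appends all lines after line 0."""
    body = "".join(line.rstrip("\n") + "\n" for line in lines)
    return "0a\n" + body + ".\n"

def apply_append_only(script):
    """Recover the appended lines from a script built by initial_ed_script."""
    rows = script.split("\n")
    if rows[0] != "0a":
        raise ValueError("expected the script to start with '0a'")
    end = rows.index(".")  # the first line holding only "." ends the append
    return rows[1:end]

# Hypothetical two-row CSV; the real files have many more columns and rows.
script = initial_ed_script(["id,name", "1,Anas discors"])
restored = apply_append_only(script)
print(script, end="")
```

A data row consisting solely of "." would end the append early, so real input would need escaping or validation before being framed this way.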
Latest Process