Harvest for Luxembourg Species List, Created: 14 Oct 02:20

Stage: completed
Fetched: 14 Oct 02:20
Validated: 14 Oct 02:20
Deltas Created: 14 Oct 02:20
Units Normalized: 14 Oct 02:47
Ancestry Built: 14 Oct 02:25
Nodes Matched: 14 Oct 02:46
Names Parsed: 14 Oct 02:25
New Models Stored: 14 Oct 02:22
Indexed: 14 Oct 02:47
Completed: 14 Oct 02:54
Time to Harvest: 34 minutes
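
The summary above records stage completion times only to the minute. The [START]/[STOP] pairs in the harvesting log carry second-resolution timestamps, so per-stage durations can be recovered from them. A minimal sketch, assuming the line shape seen in the log on this page; `LINE_RE` and `stage_durations` are illustrative names and are not part of the harvester:

```python
import re
from datetime import datetime

# Assumed line shape, inferred from the log on this page:
#   [KIND] [YYYY-MM-DD HH:MM:SS] message
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def stage_durations(lines):
    """Return {stage_name: whole seconds} for every START/STOP pair found."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # INFO, WARN, CMD and header lines are ignored
        kind, stamp, name = match.groups()
        # The final STOP line carries a ", took N" suffix; drop it so names pair up.
        name = name.split(", took")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = int((when - started.pop(name)).total_seconds())
    return durations

# Four lines copied from the log on this page.
sample = [
    "[START] [2019-10-14 02:20:39] logged process",
    "[START] [2019-10-14 02:25:33] map_all_nodes_to_pages",
    "[STOP] [2019-10-14 02:46:38] map_all_nodes_to_pages",
    "[STOP] [2019-10-14 02:54:39] logged process, took 2039.73",
]
for stage, seconds in stage_durations(sample).items():
    print(f"{stage}: {seconds}s")
```

On these four lines the sketch reports 1265 seconds for map_all_nodes_to_pages and 2040 seconds for the logged process as a whole, which agrees with the "took 2039.73" figure on the final log line at one-second resolution.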


Harvesting Log (oldest first)

# Logfile created on 2019-10-14 02:20:39 -0400 by logger.rb/56815
[START] [2019-10-14 02:20:39] logged process
[START] [2019-10-14 02:20:39] create_harvest_instance
[STOP] [2019-10-14 02:20:40] create_harvest_instance
[START] [2019-10-14 02:20:40] fetch_files
[STOP] [2019-10-14 02:20:40] fetch_files
[START] [2019-10-14 02:20:40] validate_each_file
[STOP] [2019-10-14 02:20:42] validate_each_file
[START] [2019-10-14 02:20:42] convert_to_csv
[CMD] [2019-10-14 02:20:42] /usr/bin/sort /app/public/converted_csv/luxembourg_sp_li_refs_16443.csv > /app/public/converted_csv/luxembourg_sp_li_refs_16443.csv_sorted
[CMD] [2019-10-14 02:20:42] /usr/bin/sort /app/public/converted_csv/luxembourg_sp_li_nodes_16444.csv > /app/public/converted_csv/luxembourg_sp_li_nodes_16444.csv_sorted
[CMD] [2019-10-14 02:20:42] /usr/bin/sort /app/public/converted_csv/luxembourg_sp_li_occurrences_16445.csv > /app/public/converted_csv/luxembourg_sp_li_occurrences_16445.csv_sorted
[CMD] [2019-10-14 02:20:43] /usr/bin/sort /app/public/converted_csv/luxembourg_sp_li_measurements_16446.csv > /app/public/converted_csv/luxembourg_sp_li_measurements_16446.csv_sorted
[STOP] [2019-10-14 02:20:43] convert_to_csv
[START] [2019-10-14 02:20:43] calculate_delta
[CMD] [2019-10-14 02:20:43] echo "0a" > /app/public/diff/luxembourg_sp_li_refs_16443.diff
[CMD] [2019-10-14 02:20:43] tail -n +1 /app/public/converted_csv/luxembourg_sp_li_refs_16443.csv >> /app/public/diff/luxembourg_sp_li_refs_16443.diff
[CMD] [2019-10-14 02:20:43] echo "." >> /app/public/diff/luxembourg_sp_li_refs_16443.diff
[CMD] [2019-10-14 02:20:43] echo "0a" > /app/public/diff/luxembourg_sp_li_nodes_16444.diff
[CMD] [2019-10-14 02:20:43] tail -n +1 /app/public/converted_csv/luxembourg_sp_li_nodes_16444.csv >> /app/public/diff/luxembourg_sp_li_nodes_16444.diff
[CMD] [2019-10-14 02:20:43] echo "." >> /app/public/diff/luxembourg_sp_li_nodes_16444.diff
[CMD] [2019-10-14 02:20:43] echo "0a" > /app/public/diff/luxembourg_sp_li_occurrences_16445.diff
[CMD] [2019-10-14 02:20:43] tail -n +1 /app/public/converted_csv/luxembourg_sp_li_occurrences_16445.csv >> /app/public/diff/luxembourg_sp_li_occurrences_16445.diff
[CMD] [2019-10-14 02:20:43] echo "." >> /app/public/diff/luxembourg_sp_li_occurrences_16445.diff
[CMD] [2019-10-14 02:20:44] echo "0a" > /app/public/diff/luxembourg_sp_li_measurements_16446.diff
[CMD] [2019-10-14 02:20:44] tail -n +1 /app/public/converted_csv/luxembourg_sp_li_measurements_16446.csv >> /app/public/diff/luxembourg_sp_li_measurements_16446.diff
[CMD] [2019-10-14 02:20:44] echo "." >> /app/public/diff/luxembourg_sp_li_measurements_16446.diff
[STOP] [2019-10-14 02:20:44] calculate_delta
[START] [2019-10-14 02:20:44] parse_diff_and_store
[INFO] [2019-10-14 02:20:44] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-14 02:20:44] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-14 02:20:52] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-14 02:20:55] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-14 02:22:20] Storing 2 References
[INFO] [2019-10-14 02:22:20] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-14 02:22:20] Average Time: 0.0
[INFO] [2019-10-14 02:22:20] Total Time: 1s
[INFO] [2019-10-14 02:22:20] Storing 21857 ScientificNames
[INFO] [2019-10-14 02:22:20] Processing group of 21857 in 22 groups of 1000
[INFO] [2019-10-14 02:22:29] Average Time: 0.386
[INFO] [2019-10-14 02:22:29] Total Time: 9s
[INFO] [2019-10-14 02:22:29] last 3 / first 3: 0.8
[INFO] [2019-10-14 02:22:29] Std.Dev: 0.10954451150103323; Max: 0.79
[INFO] [2019-10-14 02:22:29] Storing 21857 Nodes
[INFO] [2019-10-14 02:22:29] Processing group of 21857 in 22 groups of 1000
[INFO] [2019-10-14 02:22:35] Average Time: 0.294
[INFO] [2019-10-14 02:22:35] Total Time: 7s
[INFO] [2019-10-14 02:22:35] last 3 / first 3: 0.93
[INFO] [2019-10-14 02:22:35] Std.Dev: 0.0; Max: 0.35
[INFO] [2019-10-14 02:22:35] Storing 14460 Occurrences
[INFO] [2019-10-14 02:22:35] Processing group of 14460 in 15 groups of 1000
[INFO] [2019-10-14 02:22:37] Average Time: 0.114
[INFO] [2019-10-14 02:22:37] Total Time: 2s
[INFO] [2019-10-14 02:22:37] last 3 / first 3: 1.26
[INFO] [2019-10-14 02:22:37] Std.Dev: 0.0; Max: 0.16
[INFO] [2019-10-14 02:22:37] Storing 29550 TraitsReferences
[INFO] [2019-10-14 02:22:37] Processing group of 29550 in 30 groups of 1000
[INFO] [2019-10-14 02:22:40] Average Time: 0.098
[INFO] [2019-10-14 02:22:40] Total Time: 4s
[INFO] [2019-10-14 02:22:40] last 3 / first 3: 0.5
[INFO] [2019-10-14 02:22:40] Std.Dev: 0.05477225575051661; Max: 0.35
[INFO] [2019-10-14 02:22:40] Storing 29549 Traits
[INFO] [2019-10-14 02:22:40] Processing group of 29549 in 30 groups of 1000
[INFO] [2019-10-14 02:22:49] Average Time: 0.289
[INFO] [2019-10-14 02:22:49] Total Time: 9s
[INFO] [2019-10-14 02:22:49] last 3 / first 3: 0.72
[INFO] [2019-10-14 02:22:49] Std.Dev: 0.044721359549995794; Max: 0.41
[INFO] [2019-10-14 02:22:49] Storing 29532 MetaTraits
[INFO] [2019-10-14 02:22:49] Processing group of 29532 in 30 groups of 1000
[INFO] [2019-10-14 02:22:52] Average Time: 0.104
[INFO] [2019-10-14 02:22:52] Total Time: 4s
[INFO] [2019-10-14 02:22:52] last 3 / first 3: 0.76
[INFO] [2019-10-14 02:22:52] Std.Dev: 0.0; Max: 0.14
[STOP] [2019-10-14 02:22:52] parse_diff_and_store
[START] [2019-10-14 02:22:52] resolve_keys
[INFO] [2019-10-14 02:23:58] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-14 02:24:04] traits to occurrences...
[INFO] [2019-10-14 02:24:08] traits to nodes (through occurrences)...
[INFO] [2019-10-14 02:24:08] Traits to sex term...
[INFO] [2019-10-14 02:24:13] Traits to lifestage term...
[INFO] [2019-10-14 02:24:18] MetaTraits to traits...
[INFO] [2019-10-14 02:24:19] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-14 02:24:24] Assocs to occurrences...
[INFO] [2019-10-14 02:24:24] Assocs to nodes...
[INFO] [2019-10-14 02:24:24] Assoc to sex term...
[INFO] [2019-10-14 02:24:24] Assoc to lifestage term...
[STOP] [2019-10-14 02:24:24] resolve_keys
[START] [2019-10-14 02:24:24] hold_for_later_1
[STOP] [2019-10-14 02:24:24] hold_for_later_1
[START] [2019-10-14 02:24:24] hold_for_later_2
[STOP] [2019-10-14 02:24:24] hold_for_later_2
[START] [2019-10-14 02:24:24] resolve_missing_parents
[STOP] [2019-10-14 02:25:04] resolve_missing_parents
[START] [2019-10-14 02:25:04] rebuild_nodes
[START] [2019-10-14 02:25:04] Flattener#flatten
[START] [2019-10-14 02:25:04] Flattener#study_resource
[START] [2019-10-14 02:25:04] Flattener#build_ancestry
[STOP] [2019-10-14 02:25:06] Flattener#build_ancestry
[INFO] [2019-10-14 02:25:06] 21857 ancestry keys
[START] [2019-10-14 02:25:06] build_node_ancestors
[INFO] [2019-10-14 02:25:06] old ancestors deleted.
[STOP] [2019-10-14 02:25:10] build_node_ancestors
[START] [2019-10-14 02:25:14] Flattener#propagate_ancestor_ids
[STOP] [2019-10-14 02:25:15] Flattener#propagate_ancestor_ids
[STOP] [2019-10-14 02:25:15] Flattener#flatten
[STOP] [2019-10-14 02:25:15] rebuild_nodes
[START] [2019-10-14 02:25:15] resolve_missing_media_owners
[STOP] [2019-10-14 02:25:15] resolve_missing_media_owners
[START] [2019-10-14 02:25:15] sanitize_media_verbatims
[STOP] [2019-10-14 02:25:15] sanitize_media_verbatims
[START] [2019-10-14 02:25:15] queue_downloads
[STOP] [2019-10-14 02:25:15] queue_downloads
[START] [2019-10-14 02:25:15] parse_names
[WARN] [2019-10-14 02:25:15] I see 21857 names which still need to be parsed.
[STOP] [2019-10-14 02:25:33] parse_names
[START] [2019-10-14 02:25:33] denormalize_canonical_names_to_nodes
[STOP] [2019-10-14 02:25:33] denormalize_canonical_names_to_nodes
[START] [2019-10-14 02:25:33] match_nodes
[START] [2019-10-14 02:25:33] map_all_nodes_to_pages
[STOP] [2019-10-14 02:46:38] map_all_nodes_to_pages
[INFO] [2019-10-14 02:46:38] 2816 Unmatched nodes (of 21857)! That's too many to output. First 10: Senecio erucifolius (#50575257); Centaurea pratensis (#50575708); Jacobaea jacobaea (#50555194); Jacobaea erucifolius (#50556227); Jacobaea aquaticus (#50556706); Crepis polymorpha (#50558216); Pilosella pilosella (#50555291); Pilosella aurantiacum (#50561250); Pilosella caespitosum (#50564586); Pilosella zizianum (#50565712)
[START] [2019-10-14 02:46:38] update_nodes
[STOP] [2019-10-14 02:46:46] update_nodes
[STOP] [2019-10-14 02:46:46] match_nodes
[START] [2019-10-14 02:46:46] reindex_search
[STOP] [2019-10-14 02:47:35] reindex_search
[START] [2019-10-14 02:47:35] normalize_units
[STOP] [2019-10-14 02:47:35] normalize_units
[START] [2019-10-14 02:47:35] calculate_statistics
[STOP] [2019-10-14 02:47:35] calculate_statistics
[START] [2019-10-14 02:47:35] complete_harvest_instance
[START] [2019-10-14 02:47:35] overall_tsv_creation
[INFO] [2019-10-14 02:47:35] Processing group of 21857 in 3 batches of 10000
[INFO] [2019-10-14 02:49:03] 6013 Traits (unfiltered)...
[INFO] [2019-10-14 02:49:17] 6013 Traits (filtered)...
[INFO] [2019-10-14 02:49:17] 0 Associations (filtered)...
[INFO] [2019-10-14 02:50:08] 30059 metadata added.
[INFO] [2019-10-14 02:50:08] 0 metadata added.
[INFO] [2019-10-14 02:51:39] 7318 Traits (unfiltered)...
[INFO] [2019-10-14 02:51:53] 7318 Traits (filtered)...
[INFO] [2019-10-14 02:51:53] 0 Associations (filtered)...
[INFO] [2019-10-14 02:52:48] 36581 metadata added.
[INFO] [2019-10-14 02:52:48] 0 metadata added.
[INFO] [2019-10-14 02:53:43] 1129 Traits (unfiltered)...
[INFO] [2019-10-14 02:53:57] 1129 Traits (filtered)...
[INFO] [2019-10-14 02:53:57] 0 Associations (filtered)...
[INFO] [2019-10-14 02:54:38] 5642 metadata added.
[INFO] [2019-10-14 02:54:38] 0 metadata added.
[INFO] [2019-10-14 02:54:38] Average Time: 116.497
[INFO] [2019-10-14 02:54:38] Total Time: 7m4s
[STOP] [2019-10-14 02:54:38] overall_tsv_creation
[INFO] [2019-10-14 02:54:38] Done. Check your files:
[INFO] [2019-10-14 02:54:38] (21857 lines) /app/public/data/luxembourg_sp_li/publish_nodes.tsv
[INFO] [2019-10-14 02:54:38] (55819 lines) /app/public/data/luxembourg_sp_li/publish_node_ancestors.tsv
[INFO] [2019-10-14 02:54:38] (21857 lines) /app/public/data/luxembourg_sp_li/publish_scientific_names.tsv
[INFO] [2019-10-14 02:54:38] (14461 lines) /app/public/data/luxembourg_sp_li/publish_traits.tsv
[INFO] [2019-10-14 02:54:38] (72283 lines) /app/public/data/luxembourg_sp_li/publish_metadata.tsv
[STOP] [2019-10-14 02:54:39] complete_harvest_instance
[START] [2019-10-14 02:54:39] completed
[STOP] [2019-10-14 02:54:39] completed
[STOP] [2019-10-14 02:54:39] logged process, took 2039.73
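
The calculate_delta commands in the log write each .diff file as the line "0a", then the full CSV, then a lone ".". That layout matches an ed-style append command ("append the following lines after line 0"), which is what a first harvest with no previous version to compare against would be expected to produce. The sketch below reproduces that layout and applies it, under the assumption that this reading is correct. The log does not show how the harvester itself parses these files, so `build_initial_diff` and `apply_append` are hypothetical helpers, and the sample rows are made up:

```python
def build_initial_diff(csv_text):
    """Wrap a whole file as one ed-style append: '0a', the lines, then '.'."""
    body = csv_text
    if body and not body.endswith("\n"):
        body += "\n"
    return "0a\n" + body + ".\n"

def apply_append(existing, script):
    """Apply a single 'Na ... .' block (the only form handled in this sketch)."""
    lines = script.splitlines()
    header, payload, terminator = lines[0], lines[1:-1], lines[-1]
    if not header.endswith("a") or terminator != ".":
        raise ValueError("not a single ed-style append block")
    position = int(header[:-1])  # append after this line number; 0 means the top
    return existing[:position] + payload + existing[position:]

# Made-up sample rows standing in for a converted CSV file.
csv_text = "id,name\n1,Senecio erucifolius\n"
script = build_initial_diff(csv_text)
print(script, end="")
print(apply_append([], script))
```

One caveat of this format is that a data line consisting solely of "." would be read as the terminator, so a real implementation would need to guard against it.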
