Harvest for Global Register of Introduced and Invasive Species
Created: 03 Dec 04:15

Stage: completed
Fetched: 03 Dec 04:15
Validated: 03 Dec 04:15
Deltas Created: 03 Dec 04:15
Units Normalized: 03 Dec 05:02
Ancestry Built: 03 Dec 04:27
Nodes Matched: 03 Dec 05:01
Names Parsed: 03 Dec 04:27
New Models Stored: 03 Dec 04:25
Indexed: 03 Dec 05:02
Completed: 03 Dec 05:14
Time to Harvest: 59 minutes


Harvesting Log (most recent first)

# Logfile created on 2019-12-03 04:15:08 -0500 by logger.rb/56815
[START] [2019-12-03 04:15:08] logged process
[START] [2019-12-03 04:15:08] create_harvest_instance
[STOP] [2019-12-03 04:15:09] create_harvest_instance
[START] [2019-12-03 04:15:09] fetch_files
[STOP] [2019-12-03 04:15:09] fetch_files
[START] [2019-12-03 04:15:09] validate_each_file
[STOP] [2019-12-03 04:15:20] validate_each_file
[START] [2019-12-03 04:15:20] convert_to_csv
[CMD] [2019-12-03 04:15:20] /usr/bin/sort /app/public/converted_csv/griis_nodes_18830.csv > /app/public/converted_csv/griis_nodes_18830.csv_sorted
[CMD] [2019-12-03 04:15:22] /usr/bin/sort /app/public/converted_csv/griis_occurrences_18831.csv > /app/public/converted_csv/griis_occurrences_18831.csv_sorted
[CMD] [2019-12-03 04:15:24] /usr/bin/sort /app/public/converted_csv/griis_measurements_18832.csv > /app/public/converted_csv/griis_measurements_18832.csv_sorted
[STOP] [2019-12-03 04:15:26] convert_to_csv
[START] [2019-12-03 04:15:26] calculate_delta
[CMD] [2019-12-03 04:15:26] echo "0a" > /app/public/diff/griis_nodes_18830.diff
[CMD] [2019-12-03 04:15:27] tail -n +1 /app/public/converted_csv/griis_nodes_18830.csv >> /app/public/diff/griis_nodes_18830.diff
[CMD] [2019-12-03 04:15:29] echo "." >> /app/public/diff/griis_nodes_18830.diff
[CMD] [2019-12-03 04:15:30] echo "0a" > /app/public/diff/griis_occurrences_18831.diff
[CMD] [2019-12-03 04:15:32] tail -n +1 /app/public/converted_csv/griis_occurrences_18831.csv >> /app/public/diff/griis_occurrences_18831.diff
[CMD] [2019-12-03 04:15:34] echo "." >> /app/public/diff/griis_occurrences_18831.diff
[CMD] [2019-12-03 04:15:35] echo "0a" > /app/public/diff/griis_measurements_18832.diff
[CMD] [2019-12-03 04:15:37] tail -n +1 /app/public/converted_csv/griis_measurements_18832.csv >> /app/public/diff/griis_measurements_18832.diff
[CMD] [2019-12-03 04:15:39] echo "." >> /app/public/diff/griis_measurements_18832.diff
[STOP] [2019-12-03 04:15:40] calculate_delta
[START] [2019-12-03 04:15:40] parse_diff_and_store
[INFO] [2019-12-03 04:15:42] Loading nodes diff file into memory (true lines)...
[WARN] [2019-12-03 04:15:43] Filtered Scientific Name `Tridacna gigas  (Linnaeus, 1758)` to `Tridacna gigas (Linnaeus, 1758)`
[WARN] [2019-12-03 04:15:43] Filtered Scientific Name `Erechtites valerianaefolius  DC. (Link ex Spreng.)` to `Erechtites valerianaefolius DC. (Link ex Spreng.)`
[WARN] [2019-12-03 04:15:43] Filtered Scientific Name `Rhopobota naevana Hübner, 1814/17` to `Rhopobota naevana Hübner, 181417`
[WARN] [2019-12-03 04:15:44] Filtered Scientific Name `Gonaxis kibweziensis  (Smith, 1894)` to `Gonaxis kibweziensis (Smith, 1894)`
[WARN] [2019-12-03 04:15:45] Filtered Scientific Name `Begonia "haageana"` to `Begonia haageana`
[WARN] [2019-12-03 04:15:45] Filtered Scientific Name `Passiflora bicornis  Mill.` to `Passiflora bicornis Mill.`
[WARN] [2019-12-03 04:15:46] Filtered Scientific Name `Costus pulverulentus  C.Presl` to `Costus pulverulentus C.Presl`
[WARN] [2019-12-03 04:15:48] Filtered Scientific Name `Monia nobilis  (Reeve, 1859)` to `Monia nobilis (Reeve, 1859)`
[WARN] [2019-12-03 04:15:49] Filtered Scientific Name `Gobius couchi  Miller & El-Tawil, 1974` to `Gobius couchi Miller & El-Tawil, 1974`
[WARN] [2019-12-03 04:15:50] Filtered Scientific Name `Parupeneus forsskali  (Fourmanoir & Guézé, 1976)` to `Parupeneus forsskali (Fourmanoir & Guézé, 1976)`
[WARN] [2019-12-03 04:15:50] Filtered Scientific Name `Pennisetum gIaucum  (L) R.Br.` to `Pennisetum gIaucum (L) R.Br.`
[WARN] [2019-12-03 04:15:51] Filtered Scientific Name `Amynthas diffringens  (Baird, 1869)` to `Amynthas diffringens (Baird, 1869)`
[WARN] [2019-12-03 04:15:52] Filtered Scientific Name `Streptostele musaecola  (Morelet, 1860)` to `Streptostele musaecola (Morelet, 1860)`
[INFO] [2019-12-03 04:15:54] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-12-03 04:17:11] Loading measurements diff file into memory (true lines)...
[INFO] [2019-12-03 04:24:04] Storing 20952 ScientificNames
[INFO] [2019-12-03 04:24:04] Processing group of 20952 in 21 groups of 1000
[INFO] [2019-12-03 04:24:14] Average Time: 0.462
[INFO] [2019-12-03 04:24:14] Total Time: 10s
[INFO] [2019-12-03 04:24:14] last 3 / first 3: 0.87
[INFO] [2019-12-03 04:24:14] Std.Dev: 0.3; Max: 1.74
[INFO] [2019-12-03 04:24:14] Storing 20952 Nodes
[INFO] [2019-12-03 04:24:14] Processing group of 20952 in 21 groups of 1000
[INFO] [2019-12-03 04:24:20] Average Time: 0.306
[INFO] [2019-12-03 04:24:20] Total Time: 7s
[INFO] [2019-12-03 04:24:20] last 3 / first 3: 0.98
[INFO] [2019-12-03 04:24:20] Std.Dev: 0.03162277660168379; Max: 0.38
[INFO] [2019-12-03 04:24:20] Storing 63604 Occurrences
[INFO] [2019-12-03 04:24:20] Processing group of 63604 in 64 groups of 1000
[INFO] [2019-12-03 04:24:32] Average Time: 0.173
[INFO] [2019-12-03 04:24:32] Total Time: 12s
[INFO] [2019-12-03 04:24:32] last 3 / first 3: 0.79
[INFO] [2019-12-03 04:24:32] Std.Dev: 0.31937438845342625; Max: 2.15
[INFO] [2019-12-03 04:24:32] Storing 58867 OccurrenceMetadata
[INFO] [2019-12-03 04:24:32] Processing group of 58867 in 59 groups of 1000
[INFO] [2019-12-03 04:24:39] Average Time: 0.11
[INFO] [2019-12-03 04:24:39] Total Time: 7s
[INFO] [2019-12-03 04:24:39] last 3 / first 3: 0.83
[INFO] [2019-12-03 04:24:39] Std.Dev: 0.0; Max: 0.2
[INFO] [2019-12-03 04:24:39] Storing 92282 Traits
[INFO] [2019-12-03 04:24:39] Processing group of 92282 in 93 groups of 1000
[INFO] [2019-12-03 04:25:12] Average Time: 0.353
[INFO] [2019-12-03 04:25:12] Total Time: 34s
[INFO] [2019-12-03 04:25:12] last 3 / first 3: 0.67
[INFO] [2019-12-03 04:25:12] Std.Dev: 0.2; Max: 1.59
[INFO] [2019-12-03 04:25:12] Storing 189665 MetaTraits
[INFO] [2019-12-03 04:25:12] Processing group of 189665 in 190 groups of 1000
[INFO] [2019-12-03 04:25:39] Average Time: 0.139
[INFO] [2019-12-03 04:25:39] Total Time: 28s
[INFO] [2019-12-03 04:25:39] last 3 / first 3: 1.25
[INFO] [2019-12-03 04:25:39] Std.Dev: 0.24083189157584592; Max: 2.67
[STOP] [2019-12-03 04:25:39] parse_diff_and_store
[START] [2019-12-03 04:25:39] resolve_keys
[INFO] [2019-12-03 04:25:57] Occurrences to nodes (through scientific_names)...
[INFO] [2019-12-03 04:26:01] traits to occurrences...
[INFO] [2019-12-03 04:26:11] traits to nodes (through occurrences)...
[INFO] [2019-12-03 04:26:13] Traits to sex term...
[INFO] [2019-12-03 04:26:15] Traits to lifestage term...
[INFO] [2019-12-03 04:26:17] MetaTraits to traits...
[INFO] [2019-12-03 04:26:29] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-12-03 04:26:29] Assocs to occurrences...
[INFO] [2019-12-03 04:26:29] Assocs to nodes...
[INFO] [2019-12-03 04:26:29] Assoc to sex term...
[INFO] [2019-12-03 04:26:29] Assoc to lifestage term...
[STOP] [2019-12-03 04:26:29] resolve_keys
[START] [2019-12-03 04:26:29] hold_for_later_1
[STOP] [2019-12-03 04:26:29] hold_for_later_1
[START] [2019-12-03 04:26:29] hold_for_later_2
[STOP] [2019-12-03 04:26:29] hold_for_later_2
[START] [2019-12-03 04:26:29] resolve_missing_parents
[STOP] [2019-12-03 04:26:36] resolve_missing_parents
[START] [2019-12-03 04:26:36] rebuild_nodes
[START] [2019-12-03 04:26:36] Flattener#flatten
[START] [2019-12-03 04:26:36] Flattener#study_resource
[START] [2019-12-03 04:26:36] Flattener#build_ancestry
[STOP] [2019-12-03 04:26:42] Flattener#build_ancestry
[INFO] [2019-12-03 04:26:42] 20952 ancestry keys
[START] [2019-12-03 04:26:42] build_node_ancestors
[INFO] [2019-12-03 04:26:42] old ancestors deleted.
[STOP] [2019-12-03 04:26:57] build_node_ancestors
[START] [2019-12-03 04:26:57] Flattener#propagate_ancestor_ids
[STOP] [2019-12-03 04:27:00] Flattener#propagate_ancestor_ids
[STOP] [2019-12-03 04:27:00] Flattener#flatten
[STOP] [2019-12-03 04:27:00] rebuild_nodes
[START] [2019-12-03 04:27:00] resolve_missing_media_owners
[STOP] [2019-12-03 04:27:00] resolve_missing_media_owners
[START] [2019-12-03 04:27:00] sanitize_media_verbatims
[STOP] [2019-12-03 04:27:00] sanitize_media_verbatims
[START] [2019-12-03 04:27:00] queue_downloads
[STOP] [2019-12-03 04:27:00] queue_downloads
[START] [2019-12-03 04:27:00] parse_names
[WARN] [2019-12-03 04:27:00] I see 20952 names which still need to be parsed.
[WARN] [2019-12-03 04:27:17] I see 87 names which still need to be parsed.
[WARN] [2019-12-03 04:27:18] I see 5 names which still need to be parsed.
[STOP] [2019-12-03 04:27:20] parse_names
[START] [2019-12-03 04:27:20] denormalize_canonical_names_to_nodes
[STOP] [2019-12-03 04:27:20] denormalize_canonical_names_to_nodes
[START] [2019-12-03 04:27:20] match_nodes
[START] [2019-12-03 04:27:20] map_all_nodes_to_pages
[STOP] [2019-12-03 05:01:23] map_all_nodes_to_pages
[INFO] [2019-12-03 05:01:23] 1046 Unmatched nodes (of 20952)! That's too many to output. First 10: Cryptoblabes gnidiella (#59429313); Plodia interpunctella (#59433970); Eodiatraea rufescens (#59443059); Hypanis glabra (#59435413); Hypanis colorata (#59435785); Hypanis fragilis (#59437387); Hypanis pontica (#59442933); Vanessa cardui (#59445040); Nomophila noctuella (#59437810); Eoophyla bilinealis (#59439309)
[START] [2019-12-03 05:01:23] update_nodes
[STOP] [2019-12-03 05:01:31] update_nodes
[STOP] [2019-12-03 05:01:31] match_nodes
[START] [2019-12-03 05:01:31] reindex_search
[STOP] [2019-12-03 05:02:14] reindex_search
[START] [2019-12-03 05:02:14] normalize_units
[STOP] [2019-12-03 05:02:14] normalize_units
[START] [2019-12-03 05:02:14] calculate_statistics
[STOP] [2019-12-03 05:02:14] calculate_statistics
[START] [2019-12-03 05:02:14] complete_harvest_instance
[START] [2019-12-03 05:02:14] overall_tsv_creation
[INFO] [2019-12-03 05:02:14] Processing group of 20952 in 3 batches of 10000
[INFO] [2019-12-03 05:03:46] 40203 Traits (unfiltered)...
[INFO] [2019-12-03 05:04:00] 40203 Traits (filtered)...
[INFO] [2019-12-03 05:04:00] 0 Associations (filtered)...
[INFO] [2019-12-03 05:07:02] 108149 metadata added.
[INFO] [2019-12-03 05:07:02] 0 metadata added.
[INFO] [2019-12-03 05:08:40] 47872 Traits (unfiltered)...
[INFO] [2019-12-03 05:08:54] 47872 Traits (filtered)...
[INFO] [2019-12-03 05:08:54] 0 Associations (filtered)...
[INFO] [2019-12-03 05:12:20] 129070 metadata added.
[INFO] [2019-12-03 05:12:20] 0 metadata added.
[INFO] [2019-12-03 05:13:12] 4202 Traits (unfiltered)...
[INFO] [2019-12-03 05:13:25] 4202 Traits (filtered)...
[INFO] [2019-12-03 05:13:25] 0 Associations (filtered)...
[INFO] [2019-12-03 05:14:09] 11418 metadata added.
[INFO] [2019-12-03 05:14:09] 0 metadata added.
[INFO] [2019-12-03 05:14:09] Average Time: 209.597
[INFO] [2019-12-03 05:14:09] Total Time: 11m56s
[STOP] [2019-12-03 05:14:09] overall_tsv_creation
[INFO] [2019-12-03 05:14:09] Done. Check your files:
[INFO] [2019-12-03 05:14:11] (20866 lines) /app/public/data/griis/publish_nodes.tsv
[INFO] [2019-12-03 05:14:13] (102955 lines) /app/public/data/griis/publish_node_ancestors.tsv
[INFO] [2019-12-03 05:14:14] (20952 lines) /app/public/data/griis/publish_scientific_names.tsv
[INFO] [2019-12-03 05:14:16] (92278 lines) /app/public/data/griis/publish_traits.tsv
[INFO] [2019-12-03 05:14:18] (248638 lines) /app/public/data/griis/publish_metadata.tsv
[STOP] [2019-12-03 05:14:18] complete_harvest_instance
[START] [2019-12-03 05:14:18] completed
[STOP] [2019-12-03 05:14:18] completed
[STOP] [2019-12-03 05:14:18] logged process, took 3550.5
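
The per-stage timings in the log above can be recovered by pairing each [START] line with its matching [STOP] line. The following is a minimal sketch, assuming the line layout shown in this log; the function name `stage_durations` and the regular expression are illustrative and are not part of the harvester.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-12-03 04:15:08] fetch_files".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} from paired [START]/[STOP] lines."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip [INFO], [WARN], [CMD] and non-log lines
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        # The final STOP line carries a suffix such as ", took 3550.5".
        name = name.split(",")[0].strip()
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2019-12-03 04:15:08] logged process",
    "[START] [2019-12-03 04:27:20] match_nodes",
    "[STOP] [2019-12-03 05:01:31] match_nodes",
    "[STOP] [2019-12-03 05:14:18] logged process, took 3550.5",
]
durations = stage_durations(sample)
for name, seconds in durations.items():
    print(f"{name}: {seconds / 60:.1f} min")
```

On these four lines the sketch reports about 34 minutes for match_nodes and about 59 minutes for the whole logged process, which agrees with the "took 3550.5" figure on the final line.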

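The calculate_delta commands in the log write "0a", then the full CSV, then "." into each .diff file. This reads like an ed-style script that appends every row after line 0, that is, a first harvest with nothing to compare against. The sketch below reproduces that layout under this interpretation; `full_insert_ed_script` and `added_lines` are hypothetical names, and the harvester's own diff parser is not shown in the log.

```python
def full_insert_ed_script(csv_text):
    """Wrap csv_text in an ed 'append after line 0' command.

    Mirrors the three logged shell steps: echo "0a", append the file, echo ".".
    """
    body = csv_text
    if body and not body.endswith("\n"):
        body += "\n"  # the terminating "." must sit on its own line
    return "0a\n" + body + ".\n"


def added_lines(ed_script):
    """Return the rows carried by a single '0a' ... '.' block."""
    lines = ed_script.splitlines()
    if len(lines) < 2 or lines[0] != "0a" or lines[-1] != ".":
        raise ValueError("not a single full-insert ed block")
    return lines[1:-1]


# Made-up CSV content, for illustration only.
script = full_insert_ed_script("id,name\n1,Example row\n")
rows = added_lines(script)
print(script, end="")
print(rows)
```

One caveat of this layout: a data row consisting of a single "." would end the append block early if the script were fed to ed itself, so a parser that relies on it has to assume that no such row occurs.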