Harvest for Global Register of Introduced and Invasive Species
Created: 04 Apr 13:07

Stage: completed
Fetched: 04 Apr 13:07
Validated: 04 Apr 13:07
Deltas Created: 04 Apr 13:07
Units Normalized: 04 Apr 13:17
Ancestry Built: 04 Apr 13:10
Nodes Matched: 04 Apr 13:17
Names Parsed: 04 Apr 13:10
New Models Stored: 04 Apr 13:10
Indexed: 04 Apr 13:17
Completed: 04 Apr 13:31
Time to Harvest: less than a minute

Harvesting Log

(1521 lines) (showing only the last 1000 lines, see /app/public/data/griis/process.log for the full file)
[INFO] [2020-05-16 00:01:22] last 3 / first 3: 0.91
[INFO] [2020-05-16 00:01:22] Std.Dev: 0.03162277660168379; Max: 0.25
[INFO] [2020-05-16 00:01:22] Storing 105878 OccurrenceMetadata
[INFO] [2020-05-16 00:01:22] Processing group of 105878 in 106 groups of 1000
[INFO] [2020-05-16 00:01:39] Average Time: 0.153
[INFO] [2020-05-16 00:01:39] Total Time: 17s
[INFO] [2020-05-16 00:01:39] last 3 / first 3: 0.87
[INFO] [2020-05-16 00:01:39] Std.Dev: 0.4219004621945797; Max: 3.2
[INFO] [2020-05-16 00:01:39] Storing 108348 Traits
[INFO] [2020-05-16 00:01:39] Processing group of 108348 in 109 groups of 1000
[INFO] [2020-05-16 00:02:12] Average Time: 0.304
[INFO] [2020-05-16 00:02:12] Total Time: 34s
[INFO] [2020-05-16 00:02:12] last 3 / first 3: 0.85
[INFO] [2020-05-16 00:02:12] Std.Dev: 0.3255764119219941; Max: 3.65
[INFO] [2020-05-16 00:02:12] Storing 216696 MetaTraits
[INFO] [2020-05-16 00:02:12] Processing group of 216696 in 217 groups of 1000
[INFO] [2020-05-16 00:02:37] Average Time: 0.112
[INFO] [2020-05-16 00:02:37] Total Time: 26s
[INFO] [2020-05-16 00:02:37] last 3 / first 3: 0.97
[INFO] [2020-05-16 00:02:37] Std.Dev: 0.22803508501982758; Max: 3.45
[STOP] [2020-05-16 00:02:37] parse_diff_and_store
[START] [2020-05-16 00:02:37] resolve_keys
[INFO] [2020-05-16 00:03:10] Occurrences to nodes (through scientific_names)...
[INFO] [2020-05-16 00:03:24] traits to occurrences...
[INFO] [2020-05-16 00:03:34] traits to nodes (through occurrences)...
[INFO] [2020-05-16 00:03:36] Traits to sex term...
[INFO] [2020-05-16 00:03:38] Traits to lifestage term...
[INFO] [2020-05-16 00:03:41] MetaTraits to traits...
[INFO] [2020-05-16 00:03:55] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-05-16 00:03:56] Assocs to occurrences...
[INFO] [2020-05-16 00:03:56] Assocs to nodes...
[INFO] [2020-05-16 00:03:56] Assoc to sex term...
[INFO] [2020-05-16 00:03:56] Assoc to lifestage term...
[STOP] [2020-05-16 00:03:56] resolve_keys
[START] [2020-05-16 00:03:56] hold_for_later_1
[STOP] [2020-05-16 00:03:56] hold_for_later_1
[START] [2020-05-16 00:03:56] hold_for_later_2
[STOP] [2020-05-16 00:03:56] hold_for_later_2
[START] [2020-05-16 00:03:56] resolve_missing_parents
[STOP] [2020-05-16 00:04:00] resolve_missing_parents
[START] [2020-05-16 00:04:00] rebuild_nodes
[START] [2020-05-16 00:04:00] Flattener#flatten
[START] [2020-05-16 00:04:00] Flattener#study_resource
[START] [2020-05-16 00:04:01] Flattener#build_ancestry
[STOP] [2020-05-16 00:04:02] Flattener#build_ancestry
[INFO] [2020-05-16 00:04:02] 18987 ancestry keys
[START] [2020-05-16 00:04:02] build_node_ancestors
[INFO] [2020-05-16 00:04:02] old ancestors deleted.
[STOP] [2020-05-16 00:04:06] build_node_ancestors
[START] [2020-05-16 00:04:12] Flattener#propagate_ancestor_ids
[STOP] [2020-05-16 00:04:14] Flattener#propagate_ancestor_ids
[STOP] [2020-05-16 00:04:14] Flattener#flatten
[STOP] [2020-05-16 00:04:14] rebuild_nodes
[START] [2020-05-16 00:04:14] resolve_missing_media_owners
[STOP] [2020-05-16 00:04:14] resolve_missing_media_owners
[START] [2020-05-16 00:04:14] sanitize_media_verbatims
[STOP] [2020-05-16 00:04:14] sanitize_media_verbatims
[START] [2020-05-16 00:04:14] queue_downloads
[STOP] [2020-05-16 00:04:14] queue_downloads
[START] [2020-05-16 00:04:14] parse_names
[WARN] [2020-05-16 00:04:14] I see 18987 names which still need to be parsed.
[WARN] [2020-05-16 00:04:33] I see 85 names which still need to be parsed.
[WARN] [2020-05-16 00:04:34] I see 8 names which still need to be parsed.
[WARN] [2020-05-16 00:04:35] I see 2 names which still need to be parsed.
[STOP] [2020-05-16 00:04:36] parse_names
[START] [2020-05-16 00:04:36] denormalize_canonical_names_to_nodes
[STOP] [2020-05-16 00:04:36] denormalize_canonical_names_to_nodes
[START] [2020-05-16 00:04:36] match_nodes
[START] [2020-05-16 00:04:37] map_all_nodes_to_pages
[STOP] [2020-05-16 00:35:39] map_all_nodes_to_pages
[INFO] [2020-05-16 00:35:39] 741 Unmatched nodes (of 18987)! That's too many to output. First 10: Magnoliopsida (#77566903); Acacia truncata (#77567418); Vicia ervilla (#77567607); Acacia (#77568230); Acacia baileyana × Acacia leucoclada (#77568429); Acacia leptocarpa (#77568718); Trigonella corniculata (#77568955); Medicago laciniata (#77569024); Trifolium procumbens (#77569205); Acacia auriculiformis (#77569463)
[START] [2020-05-16 00:35:39] update_nodes
[STOP] [2020-05-16 00:35:47] update_nodes
[STOP] [2020-05-16 00:35:47] match_nodes
[START] [2020-05-16 00:35:47] reindex_search
[STOP] [2020-05-16 00:36:18] reindex_search
[START] [2020-05-16 00:36:18] normalize_units
[STOP] [2020-05-16 00:36:53] normalize_units
[START] [2020-05-16 00:36:53] calculate_statistics
[STOP] [2020-05-16 00:36:53] calculate_statistics
[START] [2020-05-16 00:36:53] complete_harvest_instance
[START] [2020-05-16 00:36:53] overall_tsv_creation
[INFO] [2020-05-16 00:36:53] Processing group of 18987 in 2 batches of 10000
[INFO] [2020-05-16 00:38:18] 53364 Traits (unfiltered)...
[INFO] [2020-05-16 00:38:31] 53364 Traits (filtered)...
[INFO] [2020-05-16 00:38:31] 0 Associations (filtered)...
[INFO] [2020-05-16 00:41:18] 159066 metadata added.
[INFO] [2020-05-16 00:41:18] 0 metadata added.
[INFO] [2020-05-16 00:42:40] 54984 Traits (unfiltered)...
[INFO] [2020-05-16 00:42:53] 54984 Traits (filtered)...
[INFO] [2020-05-16 00:42:53] 0 Associations (filtered)...
[INFO] [2020-05-16 00:45:43] 163918 metadata added.
[INFO] [2020-05-16 00:45:43] 0 metadata added.
[INFO] [2020-05-16 00:45:43] Average Time: 228.265
[INFO] [2020-05-16 00:45:43] Total Time: 8m50s
[STOP] [2020-05-16 00:45:43] overall_tsv_creation
[INFO] [2020-05-16 00:45:43] Done. Check your files:
[INFO] [2020-05-16 00:45:43] (18905 lines) /app/public/data/griis/publish_nodes.tsv
[INFO] [2020-05-16 00:45:43] (90541 lines) /app/public/data/griis/publish_node_ancestors.tsv
[INFO] [2020-05-16 00:45:43] (18987 lines) /app/public/data/griis/publish_scientific_names.tsv
[INFO] [2020-05-16 00:45:43] (108349 lines) /app/public/data/griis/publish_traits.tsv
[INFO] [2020-05-16 00:45:43] (322985 lines) /app/public/data/griis/publish_metadata.tsv
[STOP] [2020-05-16 00:45:44] complete_harvest_instance
[START] [2020-05-16 00:45:44] completed
[STOP] [2020-05-16 00:45:44] completed
[STOP] [2020-05-16 00:45:44] logged process, took 3293.78
[INFO] [2020-10-29 18:13:17] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-10-29 18:13:20] ## remove_type: ScientificName
[INFO] [2020-10-29 18:13:20] ++ Calling delete_all on 18987 instances...
[INFO] [2020-10-29 18:13:21] [18:13:21.980] Removed 18987 Scientificnames
[INFO] [2020-10-29 18:13:21] ## remove_type: Vernacular
[INFO] [2020-10-29 18:13:21] ++ Calling delete_all on 0 instances...
[INFO] [2020-10-29 18:13:21] [18:13:21.983] Removed 0 Vernaculars
[INFO] [2020-10-29 18:13:21] ## remove_type: Article
[INFO] [2020-10-29 18:13:21] ++ Calling delete_all on 0 instances...
[INFO] [2020-10-29 18:13:21] [18:13:21.986] Removed 0 Articles
[INFO] [2020-10-29 18:13:21] ## remove_type: Medium
[INFO] [2020-10-29 18:13:21] ++ Calling delete_all on 0 instances...
[INFO] [2020-10-29 18:13:21] [18:13:21.990] Removed 0 Media
[INFO] [2020-10-29 18:13:21] ## remove_type: Trait
[INFO] [2020-10-29 18:13:22] ++ Calling delete_all on 108348 instances...
[INFO] [2020-10-29 18:13:42] [18:13:42.483] Removed 108348 Traits
[INFO] [2020-10-29 18:13:42] ## remove_type: MetaTrait
[INFO] [2020-10-29 18:13:44] ++ Calling delete_all on 216696 instances...
[INFO] [2020-10-29 18:14:15] [18:14:15.782] Removed 216696 Metatraits
[INFO] [2020-10-29 18:14:15] ## remove_type: OccurrenceMetadatum
[INFO] [2020-10-29 18:14:16] ++ Calling delete_all on 105878 instances...
[INFO] [2020-10-29 18:14:21] [18:14:21.710] Removed 105878 Occurrencemetadata
[INFO] [2020-10-29 18:14:21] ## remove_type: Assoc
[INFO] [2020-10-29 18:14:21] ++ Calling delete_all on 0 instances...
[INFO] [2020-10-29 18:14:21] [18:14:21.714] Removed 0 Assocs
[INFO] [2020-10-29 18:14:21] ## remove_type: MetaAssoc
[INFO] [2020-10-29 18:14:21] ++ Calling delete_all on 0 instances...
[INFO] [2020-10-29 18:14:21] [18:14:21.716] Removed 0 Metaassocs
[INFO] [2020-10-29 18:14:21] ## remove_type: Identifier
[INFO] [2020-10-29 18:14:21] ++ Calling delete_all on 0 instances...
[INFO] [2020-10-29 18:14:21] [18:14:21.734] Removed 0 Identifiers
[INFO] [2020-10-29 18:14:21] ## remove_type: Reference
[INFO] [2020-10-29 18:14:21] ++ Calling delete_all on 0 instances...
[INFO] [2020-10-29 18:14:21] [18:14:21.737] Removed 0 References
[INFO] [2020-10-29 18:14:22] Starting batch with ID 77567664...
[INFO] [2020-10-29 18:14:23] Starting batch with ID 77567664...
[INFO] [2020-10-29 18:14:24] Starting batch with ID 77579449...
[INFO] [2020-10-29 18:14:25] Starting batch with ID 77579449...
[INFO] [2020-10-29 18:14:25] Starting batch with ID 77583398...
[INFO] [2020-10-29 18:14:26] Starting batch with ID 77583453...
[INFO] [2020-10-29 18:14:27] Starting batch with ID 77578530...
[INFO] [2020-10-29 18:14:28] Starting batch with ID 77578530...
[INFO] [2020-10-29 18:14:28] ## remove_type: Node
[INFO] [2020-10-29 18:14:28] ++ Calling delete_all on 18987 instances...
[INFO] [2020-10-29 18:14:29] [18:14:29.984] Removed 18987 Nodes
[START] [2020-10-29 18:14:43] logged process: e6e2b4aab868de10c61110399db57269c6f7c6ba

[START] [2020-10-29 18:14:43] Creating resource from OpenData
[START] [2020-10-29 18:14:44] logged process: e6e2b4aab868de10c61110399db57269c6f7c6ba

[START] [2020-10-29 18:14:44] Parse meta.xml file and create formats with fields
[STOP] [2020-10-29 18:14:44] Parse meta.xml file and create formats with fields
[STOP] [2020-10-29 18:14:44] Creating resource from OpenData
[START] [2020-10-29 18:14:44] logged process: e6e2b4aab868de10c61110399db57269c6f7c6ba

[START] [2020-10-29 18:14:45] create_harvest_instance
[STOP] [2020-10-29 18:14:49] create_harvest_instance
[START] [2020-10-29 18:14:49] fetch_files
[STOP] [2020-10-29 18:14:49] fetch_files
[START] [2020-10-29 18:14:49] validate_each_file
[STOP] [2020-10-29 18:14:53] validate_each_file
[START] [2020-10-29 18:14:53] convert_to_csv
[CMD] [2020-10-29 18:14:53] /usr/bin/sort /app/public/converted_csv/griis_nodes_23002.csv > /app/public/converted_csv/griis_nodes_23002.csv_sorted
[CMD] [2020-10-29 18:14:53] /usr/bin/sort /app/public/converted_csv/griis_occurrences_23003.csv > /app/public/converted_csv/griis_occurrences_23003.csv_sorted
[CMD] [2020-10-29 18:14:53] /usr/bin/sort /app/public/converted_csv/griis_measurements_23004.csv > /app/public/converted_csv/griis_measurements_23004.csv_sorted
[STOP] [2020-10-29 18:14:53] convert_to_csv
[START] [2020-10-29 18:14:53] calculate_delta
[CMD] [2020-10-29 18:14:53] echo "0a" > /app/public/diff/griis_nodes_23002.diff
[CMD] [2020-10-29 18:14:53] tail -n +1 /app/public/converted_csv/griis_nodes_23002.csv >> /app/public/diff/griis_nodes_23002.diff
[CMD] [2020-10-29 18:14:53] echo "." >> /app/public/diff/griis_nodes_23002.diff
[CMD] [2020-10-29 18:14:53] echo "0a" > /app/public/diff/griis_occurrences_23003.diff
[CMD] [2020-10-29 18:14:53] tail -n +1 /app/public/converted_csv/griis_occurrences_23003.csv >> /app/public/diff/griis_occurrences_23003.diff
[CMD] [2020-10-29 18:14:53] echo "." >> /app/public/diff/griis_occurrences_23003.diff
[CMD] [2020-10-29 18:14:53] echo "0a" > /app/public/diff/griis_measurements_23004.diff
[CMD] [2020-10-29 18:14:53] tail -n +1 /app/public/converted_csv/griis_measurements_23004.csv >> /app/public/diff/griis_measurements_23004.diff
[CMD] [2020-10-29 18:14:53] echo "." >> /app/public/diff/griis_measurements_23004.diff
[STOP] [2020-10-29 18:14:53] calculate_delta
[START] [2020-10-29 18:14:53] parse_diff_and_store
[INFO] [2020-10-29 18:14:54] Loading nodes diff file into memory (true lines)...
[WARN] [2020-10-29 18:14:54] New Taxonomic status: species; treatings as unusable...
[WARN] [2020-10-29 18:14:54] Filtered Scientific Name `Circenita varia  (Born, 1778)` to `Circenita varia (Born, 1778)`
[WARN] [2020-10-29 18:14:54] New Taxonomic status: variety; treatings as unusable...
[WARN] [2020-10-29 18:14:55] Filtered Scientific Name `Dialeurodes citri  (Ashmed, 1885)` to `Dialeurodes citri (Ashmed, 1885)`
[WARN] [2020-10-29 18:14:55] Filtered Scientific Name `Belucia acinanthera Triana  Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.` to `Belucia acinanthera Triana Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.`
[WARN] [2020-10-29 18:14:56] Filtered Scientific Name `Priacanthus sagittarius  Starnes, 1988` to `Priacanthus sagittarius Starnes, 1988`
[WARN] [2020-10-29 18:14:57] Filtered Scientific Name `Egeria dense  Planch.` to `Egeria dense Planch.`
[WARN] [2020-10-29 18:14:58] Filtered Scientific Name `Glyphodes pyloalis  (Walker,1859)` to `Glyphodes pyloalis (Walker,1859)`
[WARN] [2020-10-29 18:14:58] Filtered Scientific Name `Alpheus edwardsii  (Audouin, 1826)` to `Alpheus edwardsii (Audouin, 1826)`
[WARN] [2020-10-29 18:14:58] Filtered Scientific Name `Allotropa  burrelli Muesebeck, 1942` to `Allotropa burrelli Muesebeck, 1942`
[WARN] [2020-10-29 18:14:59] Filtered Scientific Name `Allotropa  convexifrons Muesebeck, 1943` to `Allotropa convexifrons Muesebeck, 1943`
[INFO] [2020-10-29 18:14:59] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-10-29 18:15:14] Loading measurements diff file into memory (true lines)...
[INFO] [2020-10-29 18:15:45] Storing 17485 ScientificNames
[INFO] [2020-10-29 18:15:45] Processing group of 17485 in 18 groups of 1000
[INFO] [2020-10-29 18:15:51] Average Time: 0.292
[INFO] [2020-10-29 18:15:51] Total Time: 6s
[INFO] [2020-10-29 18:15:51] last 3 / first 3: 0.81
[INFO] [2020-10-29 18:15:51] Std.Dev: 0.044721359549995794; Max: 0.39
[INFO] [2020-10-29 18:15:51] Storing 17485 Nodes
[INFO] [2020-10-29 18:15:51] Processing group of 17485 in 18 groups of 1000
[INFO] [2020-10-29 18:15:57] Average Time: 0.329
[INFO] [2020-10-29 18:15:57] Total Time: 7s
[INFO] [2020-10-29 18:15:57] last 3 / first 3: 0.6
[INFO] [2020-10-29 18:15:57] Std.Dev: 0.11832159566199232; Max: 0.56
[INFO] [2020-10-29 18:15:57] Storing 57655 Occurrences
[INFO] [2020-10-29 18:15:57] Processing group of 57655 in 58 groups of 1000
[INFO] [2020-10-29 18:16:06] Average Time: 0.152
[INFO] [2020-10-29 18:16:06] Total Time: 10s
[INFO] [2020-10-29 18:16:06] last 3 / first 3: 1.18
[INFO] [2020-10-29 18:16:06] Std.Dev: 0.11401754250991379; Max: 0.87
[INFO] [2020-10-29 18:16:06] Storing 85980 OccurrenceMetadata
[INFO] [2020-10-29 18:16:06] Processing group of 85980 in 86 groups of 1000
[INFO] [2020-10-29 18:16:16] Average Time: 0.115
[INFO] [2020-10-29 18:16:16] Total Time: 11s
[INFO] [2020-10-29 18:16:16] last 3 / first 3: 1.03
[INFO] [2020-10-29 18:16:16] Std.Dev: 0.0; Max: 0.24
[INFO] [2020-10-29 18:16:16] Storing 85499 Traits
[INFO] [2020-10-29 18:16:16] Processing group of 85499 in 86 groups of 1000
[INFO] [2020-10-29 18:16:52] Average Time: 0.418
[INFO] [2020-10-29 18:16:52] Total Time: 37s
[INFO] [2020-10-29 18:16:52] last 3 / first 3: 0.69
[INFO] [2020-10-29 18:16:52] Std.Dev: 0.3924283374069717; Max: 2.4
[INFO] [2020-10-29 18:16:52] Storing 85499 MetaTraits
[INFO] [2020-10-29 18:16:52] Processing group of 85499 in 86 groups of 1000
[INFO] [2020-10-29 18:17:03] Average Time: 0.12
[INFO] [2020-10-29 18:17:03] Total Time: 11s
[INFO] [2020-10-29 18:17:03] last 3 / first 3: 0.85
[INFO] [2020-10-29 18:17:03] Std.Dev: 0.03162277660168379; Max: 0.26
[STOP] [2020-10-29 18:17:03] parse_diff_and_store
[START] [2020-10-29 18:17:03] resolve_keys
[INFO] [2020-10-29 18:17:15] Occurrences to nodes (through scientific_names)...
[INFO] [2020-10-29 18:17:18] traits to occurrences...
[INFO] [2020-10-29 18:17:21] traits to nodes (through occurrences)...
[INFO] [2020-10-29 18:17:22] Traits to sex term...
[INFO] [2020-10-29 18:17:24] Traits to lifestage term...
[INFO] [2020-10-29 18:17:25] MetaTraits to traits...
[INFO] [2020-10-29 18:17:27] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-10-29 18:17:27] Assocs to occurrences...
[INFO] [2020-10-29 18:17:27] Assocs to nodes...
[INFO] [2020-10-29 18:17:27] Assoc to sex term...
[INFO] [2020-10-29 18:17:27] Assoc to lifestage term...
[INFO] [2020-10-29 18:17:27] MetaAssoc to assocs...
[STOP] [2020-10-29 18:17:27] resolve_keys
[START] [2020-10-29 18:17:27] hold_for_later_1
[STOP] [2020-10-29 18:17:27] hold_for_later_1
[START] [2020-10-29 18:17:27] hold_for_later_2
[STOP] [2020-10-29 18:17:27] hold_for_later_2
[START] [2020-10-29 18:17:27] resolve_missing_parents
[STOP] [2020-10-29 18:17:29] resolve_missing_parents
[START] [2020-10-29 18:17:29] rebuild_nodes
[START] [2020-10-29 18:17:29] Flattener#flatten
[START] [2020-10-29 18:17:29] Flattener#study_resource
[START] [2020-10-29 18:17:29] Flattener#build_ancestry
[STOP] [2020-10-29 18:17:32] Flattener#build_ancestry
[INFO] [2020-10-29 18:17:32] 17485 ancestry keys
[START] [2020-10-29 18:17:32] build_node_ancestors
[INFO] [2020-10-29 18:17:32] old ancestors deleted.
[STOP] [2020-10-29 18:17:35] build_node_ancestors
[START] [2020-10-29 18:17:40] Flattener#propagate_ancestor_ids
[STOP] [2020-10-29 18:17:42] Flattener#propagate_ancestor_ids
[STOP] [2020-10-29 18:17:42] Flattener#flatten
[STOP] [2020-10-29 18:17:42] rebuild_nodes
[START] [2020-10-29 18:17:42] resolve_missing_media_owners
[STOP] [2020-10-29 18:17:42] resolve_missing_media_owners
[START] [2020-10-29 18:17:42] sanitize_media_verbatims
[STOP] [2020-10-29 18:17:42] sanitize_media_verbatims
[START] [2020-10-29 18:17:42] queue_downloads
[STOP] [2020-10-29 18:17:42] queue_downloads
[START] [2020-10-29 18:17:42] parse_names
[WARN] [2020-10-29 18:17:42] I see 17485 names which still need to be parsed.
[WARN] [2020-10-29 18:17:55] I see 81 names which still need to be parsed.
[WARN] [2020-10-29 18:17:56] I see 8 names which still need to be parsed.
[WARN] [2020-10-29 18:17:57] I see 2 names which still need to be parsed.
[STOP] [2020-10-29 18:17:59] parse_names
[START] [2020-10-29 18:17:59] denormalize_canonical_names_to_nodes
[STOP] [2020-10-29 18:17:59] denormalize_canonical_names_to_nodes
[START] [2020-10-29 18:17:59] match_nodes
[START] [2020-10-29 18:17:59] map_all_nodes_to_pages
[STOP] [2020-10-29 18:44:01] map_all_nodes_to_pages
[INFO] [2020-10-29 18:44:01] 513 Unmatched nodes (of 17485)! That's too many to output. First 10: Vicia ervilla (#80834581); Acacia (#80835159); Acacia baileyana × Acacia leucoclada (#80835349); Desmanthus tortuosum (#80836737); Cytisus (#80841113); Desmanthus discolor (#80844182); Canavalia virosa (#80845225); Lespedeza liukiuensis (#80846396); Desmanthus uncinatum (#80847031); Desmanthus intortum (#80847180)
[START] [2020-10-29 18:44:01] update_nodes
[STOP] [2020-10-29 18:44:07] update_nodes
[STOP] [2020-10-29 18:44:07] match_nodes
[START] [2020-10-29 18:44:07] reindex_search
[STOP] [2020-10-29 18:44:31] reindex_search
[START] [2020-10-29 18:44:31] normalize_units
[STOP] [2020-10-29 18:44:31] normalize_units
[START] [2020-10-29 18:44:31] calculate_statistics
[STOP] [2020-10-29 18:44:32] calculate_statistics
[START] [2020-10-29 18:44:32] complete_harvest_instance
[START] [2020-10-29 18:44:32] overall_tsv_creation
[INFO] [2020-10-29 18:44:32] Processing group of 17485 in 2 batches of 10000
[INFO] [2020-10-29 18:45:45] 46403 Traits (unfiltered)...
[INFO] [2020-10-29 18:49:42] 46403 Traits (filtered)...
[INFO] [2020-10-29 18:49:46] 0 Associations (filtered)...
[INFO] [2020-10-29 18:49:57] 93048 metadata added.
[INFO] [2020-10-29 18:49:57] 0 metadata added.
[INFO] [2020-10-29 18:51:15] 39096 Traits (unfiltered)...
[INFO] [2020-10-29 18:54:44] 39096 Traits (filtered)...
[INFO] [2020-10-29 18:54:47] 0 Associations (filtered)...
[INFO] [2020-10-29 18:54:57] 78431 metadata added.
[INFO] [2020-10-29 18:54:57] 0 metadata added.
[INFO] [2020-10-29 18:55:02] Average Time: 283.56
[INFO] [2020-10-29 18:55:02] Total Time: 10m31s
[STOP] [2020-10-29 18:55:02] overall_tsv_creation
[INFO] [2020-10-29 18:55:02] Done. Check your files:
[INFO] [2020-10-29 18:55:03] (17407 lines) /app/public/data/griis/publish_nodes.tsv
[INFO] [2020-10-29 18:55:03] (83176 lines) /app/public/data/griis/publish_node_ancestors.tsv
[INFO] [2020-10-29 18:55:03] (17485 lines) /app/public/data/griis/publish_scientific_names.tsv
[INFO] [2020-10-29 18:55:03] (85500 lines) /app/public/data/griis/publish_traits.tsv
[INFO] [2020-10-29 18:55:03] (171480 lines) /app/public/data/griis/publish_metadata.tsv
[STOP] [2020-10-29 18:55:03] complete_harvest_instance
[START] [2020-10-29 18:55:03] completed
[STOP] [2020-10-29 18:55:03] completed
[STOP] [2020-10-29 18:55:03] logged process, took 2418.65
[INFO] [2020-11-03 14:39:18] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-11-03 14:39:22] ## remove_type: ScientificName
[INFO] [2020-11-03 14:39:22] ++ Calling delete_all on 17485 instances...
[INFO] [2020-11-03 14:39:24] [14:39:24.431] Removed 17485 Scientificnames
[INFO] [2020-11-03 14:39:24] ## remove_type: Vernacular
[INFO] [2020-11-03 14:39:24] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-03 14:39:24] [14:39:24.435] Removed 0 Vernaculars
[INFO] [2020-11-03 14:39:24] ## remove_type: Article
[INFO] [2020-11-03 14:39:24] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-03 14:39:24] [14:39:24.438] Removed 0 Articles
[INFO] [2020-11-03 14:39:24] ## remove_type: Medium
[INFO] [2020-11-03 14:39:24] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-03 14:39:24] [14:39:24.442] Removed 0 Media
[INFO] [2020-11-03 14:39:24] ## remove_type: Trait
[INFO] [2020-11-03 14:39:24] ++ Calling delete_all on 85499 instances...
[INFO] [2020-11-03 14:39:41] [14:39:41.485] Removed 85499 Traits
[INFO] [2020-11-03 14:39:41] ## remove_type: MetaTrait
[INFO] [2020-11-03 14:39:41] ++ Calling delete_all on 85499 instances...
[INFO] [2020-11-03 14:40:02] [14:40:02.599] Removed 85499 Metatraits
[INFO] [2020-11-03 14:40:02] ## remove_type: OccurrenceMetadatum
[INFO] [2020-11-03 14:40:02] ++ Calling delete_all on 85980 instances...
[INFO] [2020-11-03 14:40:09] [14:40:09.410] Removed 85980 Occurrencemetadata
[INFO] [2020-11-03 14:40:09] ## remove_type: Assoc
[INFO] [2020-11-03 14:40:09] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-03 14:40:09] [14:40:09.413] Removed 0 Assocs
[INFO] [2020-11-03 14:40:09] ## remove_type: MetaAssoc
[INFO] [2020-11-03 14:40:09] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-03 14:40:09] [14:40:09.416] Removed 0 Metaassocs
[INFO] [2020-11-03 14:40:09] ## remove_type: Identifier
[INFO] [2020-11-03 14:40:09] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-03 14:40:09] [14:40:09.418] Removed 0 Identifiers
[INFO] [2020-11-03 14:40:09] ## remove_type: Reference
[INFO] [2020-11-03 14:40:09] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-03 14:40:09] [14:40:09.421] Removed 0 References
[INFO] [2020-11-03 14:40:10] Starting batch with ID 80845423...
[INFO] [2020-11-03 14:40:11] Starting batch with ID 80849123...
[INFO] [2020-11-03 14:40:12] Starting batch with ID 80849123...
[INFO] [2020-11-03 14:40:12] Starting batch with ID 80840593...
[INFO] [2020-11-03 14:40:13] Starting batch with ID 80840593...
[INFO] [2020-11-03 14:40:14] Starting batch with ID 80845508...
[INFO] [2020-11-03 14:40:14] Starting batch with ID 80845508...
[INFO] [2020-11-03 14:40:14] Starting batch with ID 80845508...
[INFO] [2020-11-03 14:40:15] ## remove_type: Node
[INFO] [2020-11-03 14:40:15] ++ Calling delete_all on 17485 instances...
[INFO] [2020-11-03 14:40:16] [14:40:16.865] Removed 17485 Nodes
[START] [2020-11-03 14:40:28] logged process
[START] [2020-11-03 14:40:29] Creating resource from OpenData
[START] [2020-11-03 14:40:30] logged process
[START] [2020-11-03 14:40:30] Parse meta.xml file and create formats with fields
[STOP] [2020-11-03 14:40:30] Parse meta.xml file and create formats with fields
[STOP] [2020-11-03 14:40:30] Creating resource from OpenData
[START] [2020-11-03 14:40:30] logged process
[START] [2020-11-03 14:40:30] create_harvest_instance
[STOP] [2020-11-03 14:40:33] create_harvest_instance
[START] [2020-11-03 14:40:34] fetch_files
[STOP] [2020-11-03 14:40:34] fetch_files
[START] [2020-11-03 14:40:34] validate_each_file
[STOP] [2020-11-03 14:40:38] validate_each_file
[START] [2020-11-03 14:40:38] convert_to_csv
[CMD] [2020-11-03 14:40:38] /usr/bin/sort /app/public/converted_csv/griis_nodes_23152.csv > /app/public/converted_csv/griis_nodes_23152.csv_sorted
[CMD] [2020-11-03 14:40:38] /usr/bin/sort /app/public/converted_csv/griis_occurrences_23153.csv > /app/public/converted_csv/griis_occurrences_23153.csv_sorted
[CMD] [2020-11-03 14:40:38] /usr/bin/sort /app/public/converted_csv/griis_measurements_23154.csv > /app/public/converted_csv/griis_measurements_23154.csv_sorted
[STOP] [2020-11-03 14:40:38] convert_to_csv
[START] [2020-11-03 14:40:38] calculate_delta
[CMD] [2020-11-03 14:40:38] echo "0a" > /app/public/diff/griis_nodes_23152.diff
[CMD] [2020-11-03 14:40:38] tail -n +1 /app/public/converted_csv/griis_nodes_23152.csv >> /app/public/diff/griis_nodes_23152.diff
[CMD] [2020-11-03 14:40:38] echo "." >> /app/public/diff/griis_nodes_23152.diff
[CMD] [2020-11-03 14:40:38] echo "0a" > /app/public/diff/griis_occurrences_23153.diff
[CMD] [2020-11-03 14:40:38] tail -n +1 /app/public/converted_csv/griis_occurrences_23153.csv >> /app/public/diff/griis_occurrences_23153.diff
[CMD] [2020-11-03 14:40:38] echo "." >> /app/public/diff/griis_occurrences_23153.diff
[CMD] [2020-11-03 14:40:38] echo "0a" > /app/public/diff/griis_measurements_23154.diff
[CMD] [2020-11-03 14:40:38] tail -n +1 /app/public/converted_csv/griis_measurements_23154.csv >> /app/public/diff/griis_measurements_23154.diff
[CMD] [2020-11-03 14:40:38] echo "." >> /app/public/diff/griis_measurements_23154.diff
[STOP] [2020-11-03 14:40:38] calculate_delta
[START] [2020-11-03 14:40:38] parse_diff_and_store
[INFO] [2020-11-03 14:40:38] Loading nodes diff file into memory (true lines)...
[WARN] [2020-11-03 14:40:39] New Taxonomic status: species; treatings as unusable...
[WARN] [2020-11-03 14:40:39] Filtered Scientific Name `Circenita varia  (Born, 1778)` to `Circenita varia (Born, 1778)`
[WARN] [2020-11-03 14:40:39] New Taxonomic status: variety; treatings as unusable...
[WARN] [2020-11-03 14:40:39] Filtered Scientific Name `Dialeurodes citri  (Ashmed, 1885)` to `Dialeurodes citri (Ashmed, 1885)`
[WARN] [2020-11-03 14:40:39] Filtered Scientific Name `Belucia acinanthera Triana  Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.` to `Belucia acinanthera Triana Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.`
[WARN] [2020-11-03 14:40:41] Filtered Scientific Name `Priacanthus sagittarius  Starnes, 1988` to `Priacanthus sagittarius Starnes, 1988`
[WARN] [2020-11-03 14:40:42] Filtered Scientific Name `Egeria dense  Planch.` to `Egeria dense Planch.`
[WARN] [2020-11-03 14:40:42] Filtered Scientific Name `Glyphodes pyloalis  (Walker,1859)` to `Glyphodes pyloalis (Walker,1859)`
[WARN] [2020-11-03 14:40:43] Filtered Scientific Name `Alpheus edwardsii  (Audouin, 1826)` to `Alpheus edwardsii (Audouin, 1826)`
[WARN] [2020-11-03 14:40:43] Filtered Scientific Name `Allotropa  burrelli Muesebeck, 1942` to `Allotropa burrelli Muesebeck, 1942`
[WARN] [2020-11-03 14:40:43] Filtered Scientific Name `Allotropa  convexifrons Muesebeck, 1943` to `Allotropa convexifrons Muesebeck, 1943`
[INFO] [2020-11-03 14:40:43] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-11-03 14:40:58] Loading measurements diff file into memory (true lines)...
[STOP] [2020-11-03 14:40:59] parse_diff_and_store
[ERR] [2020-11-03 14:40:59] RuntimeError
[ERR] [2020-11-03 14:40:59] Missing Term for URI `http://www.geonames.org/1036973`, must be added!
[ERR] [2020-11-03 14:40:59] ../models/store/model_builder.rb:539:in `fail_on_bad_uri'
[ERR] [2020-11-03 14:40:59] ../models/store/model_builder.rb:501:in `convert_trait_value'
[ERR] [2020-11-03 14:40:59] ../models/store/model_builder.rb:359:in `build_trait'
[ERR] [2020-11-03 14:40:59] ../models/store/model_builder.rb:29:in `build_models'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:339:in `block (3 levels) in parse_diff_and_store'
[ERR] [2020-11-03 14:40:59] ../models/csv_parser.rb:97:in `block in diff_as_hashes'
[ERR] [2020-11-03 14:40:59] ../models/csv_parser.rb:26:in `block in line_at_a_time'
[ERR] [2020-11-03 14:40:59] ../models/csv_parser.rb:24:in `line_at_a_time'
[ERR] [2020-11-03 14:40:59] ../models/csv_parser.rb:89:in `diff_as_hashes'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:308:in `block (2 levels) in parse_diff_and_store'
[ERR] [2020-11-03 14:40:59] ../models/logged_process.rb:62:in `enter_group'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:307:in `block in parse_diff_and_store'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:682:in `block in each_diff'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:671:in `each_diff'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:302:in `parse_diff_and_store'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:85:in `block (3 levels) in start'
[ERR] [2020-11-03 14:40:59] ../models/logged_process.rb:19:in `run_step'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:85:in `block (2 levels) in start'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:74:in `each_key'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:74:in `block in start'
[ERR] [2020-11-03 14:40:59] ../models/resource.rb:151:in `lock'
[ERR] [2020-11-03 14:40:59] ../models/resource_harvester.rb:72:in `start'
[ERR] [2020-11-03 14:40:59] ../models/resource.rb:232:in `harvest'
[ERR] [2020-11-03 14:40:59] ../models/resource.rb:208:in `re_download_opendata_and_harvest'
[ERR] [2020-11-03 14:40:59] bin/rails:4:in `require'
[ERR] [2020-11-03 14:40:59] bin/rails:4:in `<main>'
[STOP] [2020-11-03 14:40:59] logged process, took 28.86
[INFO] [2020-11-09 13:25:06] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-11-09 13:25:07] ## remove_type: ScientificName
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.436] Removed 0 Scientificnames
[INFO] [2020-11-09 13:25:07] ## remove_type: Vernacular
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.439] Removed 0 Vernaculars
[INFO] [2020-11-09 13:25:07] ## remove_type: Article
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.442] Removed 0 Articles
[INFO] [2020-11-09 13:25:07] ## remove_type: Medium
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.445] Removed 0 Media
[INFO] [2020-11-09 13:25:07] ## remove_type: Trait
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.449] Removed 0 Traits
[INFO] [2020-11-09 13:25:07] ## remove_type: MetaTrait
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.452] Removed 0 Metatraits
[INFO] [2020-11-09 13:25:07] ## remove_type: OccurrenceMetadatum
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.455] Removed 0 Occurrencemetadata
[INFO] [2020-11-09 13:25:07] ## remove_type: Assoc
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.458] Removed 0 Assocs
[INFO] [2020-11-09 13:25:07] ## remove_type: MetaAssoc
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.461] Removed 0 Metaassocs
[INFO] [2020-11-09 13:25:07] ## remove_type: Identifier
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.463] Removed 0 Identifiers
[INFO] [2020-11-09 13:25:07] ## remove_type: Reference
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.467] Removed 0 References
[INFO] [2020-11-09 13:25:07] ## remove_type: Node
[INFO] [2020-11-09 13:25:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-09 13:25:07] [13:25:07.897] Removed 0 Nodes
[START] [2020-11-09 13:25:08] logged process: 9719e9b2443e9b331c213f6700b6e76a1b7c8548

[START] [2020-11-09 13:25:08] Creating resource from OpenData
[START] [2020-11-09 13:25:09] logged process: 9719e9b2443e9b331c213f6700b6e76a1b7c8548

[START] [2020-11-09 13:25:09] Parse meta.xml file and create formats with fields
[STOP] [2020-11-09 13:25:09] Parse meta.xml file and create formats with fields
[STOP] [2020-11-09 13:25:09] Creating resource from OpenData
[START] [2020-11-09 13:25:09] logged process: 9719e9b2443e9b331c213f6700b6e76a1b7c8548

[START] [2020-11-09 13:25:09] create_harvest_instance
[STOP] [2020-11-09 13:25:11] create_harvest_instance
[START] [2020-11-09 13:25:11] fetch_files
[STOP] [2020-11-09 13:25:11] fetch_files
[START] [2020-11-09 13:25:11] validate_each_file
[STOP] [2020-11-09 13:25:15] validate_each_file
[START] [2020-11-09 13:25:15] convert_to_csv
[CMD] [2020-11-09 13:25:15] /usr/bin/sort /app/public/converted_csv/griis_nodes_23771.csv > /app/public/converted_csv/griis_nodes_23771.csv_sorted
[CMD] [2020-11-09 13:25:15] /usr/bin/sort /app/public/converted_csv/griis_occurrences_23772.csv > /app/public/converted_csv/griis_occurrences_23772.csv_sorted
[CMD] [2020-11-09 13:25:15] /usr/bin/sort /app/public/converted_csv/griis_measurements_23773.csv > /app/public/converted_csv/griis_measurements_23773.csv_sorted
[STOP] [2020-11-09 13:25:15] convert_to_csv
[START] [2020-11-09 13:25:15] calculate_delta
[CMD] [2020-11-09 13:25:15] echo "0a" > /app/public/diff/griis_nodes_23771.diff
[CMD] [2020-11-09 13:25:15] tail -n +1 /app/public/converted_csv/griis_nodes_23771.csv >> /app/public/diff/griis_nodes_23771.diff
[CMD] [2020-11-09 13:25:15] echo "." >> /app/public/diff/griis_nodes_23771.diff
[CMD] [2020-11-09 13:25:15] echo "0a" > /app/public/diff/griis_occurrences_23772.diff
[CMD] [2020-11-09 13:25:15] tail -n +1 /app/public/converted_csv/griis_occurrences_23772.csv >> /app/public/diff/griis_occurrences_23772.diff
[CMD] [2020-11-09 13:25:15] echo "." >> /app/public/diff/griis_occurrences_23772.diff
[CMD] [2020-11-09 13:25:15] echo "0a" > /app/public/diff/griis_measurements_23773.diff
[CMD] [2020-11-09 13:25:15] tail -n +1 /app/public/converted_csv/griis_measurements_23773.csv >> /app/public/diff/griis_measurements_23773.diff
[CMD] [2020-11-09 13:25:15] echo "." >> /app/public/diff/griis_measurements_23773.diff
[STOP] [2020-11-09 13:25:15] calculate_delta
[START] [2020-11-09 13:25:15] parse_diff_and_store
[INFO] [2020-11-09 13:25:15] Loading nodes diff file into memory (true lines)...
[WARN] [2020-11-09 13:25:16] New Taxonomic status: species; treatings as unusable...
[WARN] [2020-11-09 13:25:16] Filtered Scientific Name `Circenita varia  (Born, 1778)` to `Circenita varia (Born, 1778)`
[WARN] [2020-11-09 13:25:16] New Taxonomic status: variety; treatings as unusable...
[WARN] [2020-11-09 13:25:16] Filtered Scientific Name `Dialeurodes citri  (Ashmed, 1885)` to `Dialeurodes citri (Ashmed, 1885)`
[WARN] [2020-11-09 13:25:16] Filtered Scientific Name `Belucia acinanthera Triana  Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.` to `Belucia acinanthera Triana Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.`
[WARN] [2020-11-09 13:25:18] Filtered Scientific Name `Priacanthus sagittarius  Starnes, 1988` to `Priacanthus sagittarius Starnes, 1988`
[WARN] [2020-11-09 13:25:19] Filtered Scientific Name `Egeria dense  Planch.` to `Egeria dense Planch.`
[WARN] [2020-11-09 13:25:20] Filtered Scientific Name `Glyphodes pyloalis  (Walker,1859)` to `Glyphodes pyloalis (Walker,1859)`
[WARN] [2020-11-09 13:25:20] Filtered Scientific Name `Alpheus edwardsii  (Audouin, 1826)` to `Alpheus edwardsii (Audouin, 1826)`
[WARN] [2020-11-09 13:25:20] Filtered Scientific Name `Allotropa  burrelli Muesebeck, 1942` to `Allotropa burrelli Muesebeck, 1942`
[WARN] [2020-11-09 13:25:20] Filtered Scientific Name `Allotropa  convexifrons Muesebeck, 1943` to `Allotropa convexifrons Muesebeck, 1943`
[INFO] [2020-11-09 13:25:21] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-11-09 13:25:35] Loading measurements diff file into memory (true lines)...
[STOP] [2020-11-09 13:25:37] parse_diff_and_store
[ERR] [2020-11-09 13:25:37] RuntimeError
[ERR] [2020-11-09 13:25:37] Missing Term for URI `http://www.geonames.org/1036973`, must be added!
[ERR] [2020-11-09 13:25:37] ../models/store/model_builder.rb:635:in `fail_on_bad_uri'
[ERR] [2020-11-09 13:25:37] ../models/store/model_builder.rb:589:in `convert_trait_value'
[ERR] [2020-11-09 13:25:37] ../models/store/model_builder.rb:415:in `build_trait'
[ERR] [2020-11-09 13:25:37] ../models/store/model_builder.rb:28:in `build_models'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:357:in `block (3 levels) in parse_diff_and_store'
[ERR] [2020-11-09 13:25:37] ../models/csv_parser.rb:111:in `block in diff_as_hashes'
[ERR] [2020-11-09 13:25:37] ../models/csv_parser.rb:28:in `block in line_at_a_time'
[ERR] [2020-11-09 13:25:37] ../models/csv_parser.rb:25:in `line_at_a_time'
[ERR] [2020-11-09 13:25:37] ../models/csv_parser.rb:96:in `diff_as_hashes'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:313:in `block (2 levels) in parse_diff_and_store'
[ERR] [2020-11-09 13:25:37] ../models/logged_process.rb:62:in `enter_group'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:312:in `block in parse_diff_and_store'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:700:in `block in each_diff'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:689:in `each_diff'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:307:in `parse_diff_and_store'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:85:in `block (3 levels) in start'
[ERR] [2020-11-09 13:25:37] ../models/logged_process.rb:19:in `run_step'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:85:in `block (2 levels) in start'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:74:in `each_key'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:74:in `block in start'
[ERR] [2020-11-09 13:25:37] ../models/resource.rb:151:in `lock'
[ERR] [2020-11-09 13:25:37] ../models/resource_harvester.rb:72:in `start'
[ERR] [2020-11-09 13:25:37] ../models/resource.rb:232:in `harvest'
[ERR] [2020-11-09 13:25:37] ../models/resource.rb:208:in `re_download_opendata_and_harvest'
[ERR] [2020-11-09 13:25:37] bin/rails:4:in `require'
[ERR] [2020-11-09 13:25:37] bin/rails:4:in `<main>'
[STOP] [2020-11-09 13:25:37] logged process, took 27.39
[INFO] [2020-11-10 15:28:15] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-11-10 15:28:16] ## remove_type: ScientificName
[INFO] [2020-11-10 15:28:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:16] [15:28:16.984] Removed 0 Scientificnames
[INFO] [2020-11-10 15:28:16] ## remove_type: Vernacular
[INFO] [2020-11-10 15:28:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:16] [15:28:16.987] Removed 0 Vernaculars
[INFO] [2020-11-10 15:28:16] ## remove_type: Article
[INFO] [2020-11-10 15:28:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:16] [15:28:16.990] Removed 0 Articles
[INFO] [2020-11-10 15:28:16] ## remove_type: Medium
[INFO] [2020-11-10 15:28:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:16] [15:28:16.994] Removed 0 Media
[INFO] [2020-11-10 15:28:16] ## remove_type: Trait
[INFO] [2020-11-10 15:28:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:16] [15:28:16.998] Removed 0 Traits
[INFO] [2020-11-10 15:28:16] ## remove_type: MetaTrait
[INFO] [2020-11-10 15:28:17] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:17] [15:28:17.000] Removed 0 Metatraits
[INFO] [2020-11-10 15:28:17] ## remove_type: OccurrenceMetadatum
[INFO] [2020-11-10 15:28:17] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:17] [15:28:17.003] Removed 0 Occurrencemetadata
[INFO] [2020-11-10 15:28:17] ## remove_type: Assoc
[INFO] [2020-11-10 15:28:17] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:17] [15:28:17.006] Removed 0 Assocs
[INFO] [2020-11-10 15:28:17] ## remove_type: MetaAssoc
[INFO] [2020-11-10 15:28:17] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:17] [15:28:17.009] Removed 0 Metaassocs
[INFO] [2020-11-10 15:28:17] ## remove_type: Identifier
[INFO] [2020-11-10 15:28:17] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:17] [15:28:17.012] Removed 0 Identifiers
[INFO] [2020-11-10 15:28:17] ## remove_type: Reference
[INFO] [2020-11-10 15:28:17] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:17] [15:28:17.015] Removed 0 References
[INFO] [2020-11-10 15:28:17] ## remove_type: Node
[INFO] [2020-11-10 15:28:17] ++ Calling delete_all on 0 instances...
[INFO] [2020-11-10 15:28:17] [15:28:17.034] Removed 0 Nodes
[START] [2020-11-10 15:28:17] logged process: 9719e9b2443e9b331c213f6700b6e76a1b7c8548

[START] [2020-11-10 15:28:17] Creating resource from OpenData
[START] [2020-11-10 15:28:18] logged process: 9719e9b2443e9b331c213f6700b6e76a1b7c8548

[START] [2020-11-10 15:28:18] Parse meta.xml file and create formats with fields
[STOP] [2020-11-10 15:28:18] Parse meta.xml file and create formats with fields
[STOP] [2020-11-10 15:28:18] Creating resource from OpenData
[START] [2020-11-10 15:28:18] logged process: 9719e9b2443e9b331c213f6700b6e76a1b7c8548

[START] [2020-11-10 15:28:18] create_harvest_instance
[STOP] [2020-11-10 15:28:19] create_harvest_instance
[START] [2020-11-10 15:28:19] fetch_files
[STOP] [2020-11-10 15:28:19] fetch_files
[START] [2020-11-10 15:28:19] validate_each_file
[STOP] [2020-11-10 15:28:24] validate_each_file
[START] [2020-11-10 15:28:24] convert_to_csv
[CMD] [2020-11-10 15:28:24] /usr/bin/sort /app/public/converted_csv/griis_nodes_24013.csv > /app/public/converted_csv/griis_nodes_24013.csv_sorted
[CMD] [2020-11-10 15:28:24] /usr/bin/sort /app/public/converted_csv/griis_occurrences_24014.csv > /app/public/converted_csv/griis_occurrences_24014.csv_sorted
[CMD] [2020-11-10 15:28:24] /usr/bin/sort /app/public/converted_csv/griis_measurements_24015.csv > /app/public/converted_csv/griis_measurements_24015.csv_sorted
[STOP] [2020-11-10 15:28:24] convert_to_csv
[START] [2020-11-10 15:28:24] calculate_delta
[CMD] [2020-11-10 15:28:24] echo "0a" > /app/public/diff/griis_nodes_24013.diff
[CMD] [2020-11-10 15:28:24] tail -n +1 /app/public/converted_csv/griis_nodes_24013.csv >> /app/public/diff/griis_nodes_24013.diff
[CMD] [2020-11-10 15:28:24] echo "." >> /app/public/diff/griis_nodes_24013.diff
[CMD] [2020-11-10 15:28:24] echo "0a" > /app/public/diff/griis_occurrences_24014.diff
[CMD] [2020-11-10 15:28:24] tail -n +1 /app/public/converted_csv/griis_occurrences_24014.csv >> /app/public/diff/griis_occurrences_24014.diff
[CMD] [2020-11-10 15:28:24] echo "." >> /app/public/diff/griis_occurrences_24014.diff
[CMD] [2020-11-10 15:28:24] echo "0a" > /app/public/diff/griis_measurements_24015.diff
[CMD] [2020-11-10 15:28:24] tail -n +1 /app/public/converted_csv/griis_measurements_24015.csv >> /app/public/diff/griis_measurements_24015.diff
[CMD] [2020-11-10 15:28:24] echo "." >> /app/public/diff/griis_measurements_24015.diff
[STOP] [2020-11-10 15:28:24] calculate_delta
[START] [2020-11-10 15:28:24] parse_diff_and_store
[INFO] [2020-11-10 15:28:24] Loading nodes diff file into memory (true lines)...
[WARN] [2020-11-10 15:28:24] New Taxonomic status: species; treatings as unusable...
[WARN] [2020-11-10 15:28:25] Filtered Scientific Name `Circenita varia  (Born, 1778)` to `Circenita varia (Born, 1778)`
[WARN] [2020-11-10 15:28:25] New Taxonomic status: variety; treatings as unusable...
[WARN] [2020-11-10 15:28:25] Filtered Scientific Name `Dialeurodes citri  (Ashmed, 1885)` to `Dialeurodes citri (Ashmed, 1885)`
[WARN] [2020-11-10 15:28:25] Filtered Scientific Name `Belucia acinanthera Triana  Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.` to `Belucia acinanthera Triana Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.`
[WARN] [2020-11-10 15:28:27] Filtered Scientific Name `Priacanthus sagittarius  Starnes, 1988` to `Priacanthus sagittarius Starnes, 1988`
[WARN] [2020-11-10 15:28:28] Filtered Scientific Name `Egeria dense  Planch.` to `Egeria dense Planch.`
[WARN] [2020-11-10 15:28:28] Filtered Scientific Name `Glyphodes pyloalis  (Walker,1859)` to `Glyphodes pyloalis (Walker,1859)`
[WARN] [2020-11-10 15:28:28] Filtered Scientific Name `Alpheus edwardsii  (Audouin, 1826)` to `Alpheus edwardsii (Audouin, 1826)`
[WARN] [2020-11-10 15:28:29] Filtered Scientific Name `Allotropa  burrelli Muesebeck, 1942` to `Allotropa burrelli Muesebeck, 1942`
[WARN] [2020-11-10 15:28:29] Filtered Scientific Name `Allotropa  convexifrons Muesebeck, 1943` to `Allotropa convexifrons Muesebeck, 1943`
[INFO] [2020-11-10 15:28:29] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-11-10 15:28:43] Loading measurements diff file into memory (true lines)...
[INFO] [2020-11-10 15:29:16] Storing 17485 ScientificNames
[INFO] [2020-11-10 15:29:16] Processing group of 17485 in 18 groups of 1000
[INFO] [2020-11-10 15:29:25] Average Time: 0.489
[INFO] [2020-11-10 15:29:25] Total Time: 9s
[INFO] [2020-11-10 15:29:25] last 3 / first 3: 0.84
[INFO] [2020-11-10 15:29:25] Std.Dev: 0.09486832980505137; Max: 0.67
[INFO] [2020-11-10 15:29:25] Storing 17485 Nodes
[INFO] [2020-11-10 15:29:25] Processing group of 17485 in 18 groups of 1000
[INFO] [2020-11-10 15:29:33] Average Time: 0.456
[INFO] [2020-11-10 15:29:33] Total Time: 9s
[INFO] [2020-11-10 15:29:33] last 3 / first 3: 0.63
[INFO] [2020-11-10 15:29:33] Std.Dev: 0.10488088481701516; Max: 0.58
[INFO] [2020-11-10 15:29:33] Storing 57655 Occurrences
[INFO] [2020-11-10 15:29:33] Processing group of 57655 in 58 groups of 1000
[INFO] [2020-11-10 15:29:43] Average Time: 0.159
[INFO] [2020-11-10 15:29:43] Total Time: 10s
[INFO] [2020-11-10 15:29:43] last 3 / first 3: 0.84
[INFO] [2020-11-10 15:29:43] Std.Dev: 0.17320508075688773; Max: 1.44
[INFO] [2020-11-10 15:29:43] Storing 81953 OccurrenceMetadata
[INFO] [2020-11-10 15:29:43] Processing group of 81953 in 82 groups of 1000
[INFO] [2020-11-10 15:29:53] Average Time: 0.116
[INFO] [2020-11-10 15:29:53] Total Time: 10s
[INFO] [2020-11-10 15:29:53] last 3 / first 3: 1.03
[INFO] [2020-11-10 15:29:53] Std.Dev: 0.0; Max: 0.21
[INFO] [2020-11-10 15:29:53] Storing 85499 Traits
[INFO] [2020-11-10 15:29:53] Processing group of 85499 in 86 groups of 1000
[INFO] [2020-11-10 15:30:25] Average Time: 0.373
[INFO] [2020-11-10 15:30:25] Total Time: 33s
[INFO] [2020-11-10 15:30:25] last 3 / first 3: 3.42
[INFO] [2020-11-10 15:30:25] Std.Dev: 0.31144823004794875; Max: 2.64
[INFO] [2020-11-10 15:30:25] Storing 85499 MetaTraits
[INFO] [2020-11-10 15:30:25] Processing group of 85499 in 86 groups of 1000
[INFO] [2020-11-10 15:30:38] Average Time: 0.148
[INFO] [2020-11-10 15:30:38] Total Time: 14s
[INFO] [2020-11-10 15:30:38] last 3 / first 3: 0.64
[INFO] [2020-11-10 15:30:38] Std.Dev: 0.2469817807045694; Max: 2.39
[STOP] [2020-11-10 15:30:38] parse_diff_and_store
[START] [2020-11-10 15:30:38] resolve_keys
[INFO] [2020-11-10 15:31:11] Occurrences to nodes (through scientific_names)...
[INFO] [2020-11-10 15:31:14] traits to occurrences...
[INFO] [2020-11-10 15:31:18] traits to nodes (through occurrences)...
[INFO] [2020-11-10 15:31:20] Traits to sex term...
[INFO] [2020-11-10 15:31:21] Traits to lifestage term...
[INFO] [2020-11-10 15:31:22] MetaTraits to traits...
[INFO] [2020-11-10 15:31:25] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-11-10 15:31:25] Assocs to occurrences...
[INFO] [2020-11-10 15:31:25] Assocs to nodes...
[INFO] [2020-11-10 15:31:25] Assoc to sex term...
[INFO] [2020-11-10 15:31:25] Assoc to lifestage term...
[INFO] [2020-11-10 15:31:25] MetaAssoc to assocs...
[STOP] [2020-11-10 15:31:25] resolve_keys
[START] [2020-11-10 15:31:25] hold_for_later_1
[STOP] [2020-11-10 15:31:25] hold_for_later_1
[START] [2020-11-10 15:31:25] hold_for_later_2
[STOP] [2020-11-10 15:31:25] hold_for_later_2
[START] [2020-11-10 15:31:25] resolve_missing_parents
[STOP] [2020-11-10 15:31:26] resolve_missing_parents
[START] [2020-11-10 15:31:26] rebuild_nodes
[START] [2020-11-10 15:31:26] Flattener#flatten
[START] [2020-11-10 15:31:26] Flattener#study_resource
[START] [2020-11-10 15:31:26] Flattener#build_ancestry
[STOP] [2020-11-10 15:31:29] Flattener#build_ancestry
[INFO] [2020-11-10 15:31:29] 17485 ancestry keys
[START] [2020-11-10 15:31:29] build_node_ancestors
[INFO] [2020-11-10 15:31:29] old ancestors deleted.
[STOP] [2020-11-10 15:31:33] build_node_ancestors
[START] [2020-11-10 15:31:38] Flattener#propagate_ancestor_ids
[STOP] [2020-11-10 15:31:40] Flattener#propagate_ancestor_ids
[STOP] [2020-11-10 15:31:40] Flattener#flatten
[STOP] [2020-11-10 15:31:40] rebuild_nodes
[START] [2020-11-10 15:31:40] resolve_missing_media_owners
[STOP] [2020-11-10 15:31:40] resolve_missing_media_owners
[START] [2020-11-10 15:31:40] sanitize_media_verbatims
[STOP] [2020-11-10 15:31:40] sanitize_media_verbatims
[START] [2020-11-10 15:31:40] queue_downloads
[STOP] [2020-11-10 15:31:40] queue_downloads
[START] [2020-11-10 15:31:40] parse_names
[WARN] [2020-11-10 15:31:40] I see 17485 names which still need to be parsed.
[WARN] [2020-11-10 15:31:56] I see 81 names which still need to be parsed.
[WARN] [2020-11-10 15:31:58] I see 8 names which still need to be parsed.
[WARN] [2020-11-10 15:31:59] I see 2 names which still need to be parsed.
[STOP] [2020-11-10 15:32:00] parse_names
[START] [2020-11-10 15:32:00] denormalize_canonical_names_to_nodes
[STOP] [2020-11-10 15:32:00] denormalize_canonical_names_to_nodes
[START] [2020-11-10 15:32:00] match_nodes
[START] [2020-11-10 15:32:00] map_all_nodes_to_pages
[STOP] [2020-11-10 16:05:18] map_all_nodes_to_pages
[INFO] [2020-11-10 16:05:18] 666 Unmatched nodes (of 17485)! That's too many to output. First 10: Vicia ervilla (#81636753); Acacia (#81637331); Macroptilium atropurpureum (#81637402); Acacia baileyana × Acacia leucoclada (#81637521); Desmanthus tortuosum (#81638909); Aeschynomene schimperi (#81639699); Bolusanthus speciosus (#81639999); Rhynchosia viscosa (#81640369); Tephrosia linearis (#81640502); Crotalaria trichotoma (#81640748)
[START] [2020-11-10 16:05:18] update_nodes
[STOP] [2020-11-10 16:05:27] update_nodes
[STOP] [2020-11-10 16:05:27] match_nodes
[START] [2020-11-10 16:05:27] reindex_search
[STOP] [2020-11-10 16:05:59] reindex_search
[START] [2020-11-10 16:05:59] normalize_units
[STOP] [2020-11-10 16:06:00] normalize_units
[START] [2020-11-10 16:06:00] calculate_statistics
[STOP] [2020-11-10 16:06:01] calculate_statistics
[START] [2020-11-10 16:06:01] complete_harvest_instance
[START] [2020-11-10 16:06:01] overall_tsv_creation
[INFO] [2020-11-10 16:06:01] Processing group of 17485 in 2 batches of 10000
[INFO] [2020-11-10 16:07:14] 46403 Traits (unfiltered)...
[INFO] [2020-11-10 16:10:43] 46403 Traits (filtered)...
[INFO] [2020-11-10 16:10:47] 0 Associations (filtered)...
[INFO] [2020-11-10 16:10:57] 90818 metadata added.
[INFO] [2020-11-10 16:10:57] 0 metadata added.
[INFO] [2020-11-10 16:12:13] 39096 Traits (unfiltered)...
[INFO] [2020-11-10 16:15:18] 39096 Traits (filtered)...
[INFO] [2020-11-10 16:15:22] 0 Associations (filtered)...
[INFO] [2020-11-10 16:15:33] 76634 metadata added.
[INFO] [2020-11-10 16:15:33] 0 metadata added.
[INFO] [2020-11-10 16:15:39] Average Time: 258.24
[INFO] [2020-11-10 16:15:39] Total Time: 9m38s
[STOP] [2020-11-10 16:15:39] overall_tsv_creation
[INFO] [2020-11-10 16:15:39] Done. Check your files:
[INFO] [2020-11-10 16:15:39] (17407 lines) /app/public/data/griis/publish_nodes.tsv
[INFO] [2020-11-10 16:15:39] (83176 lines) /app/public/data/griis/publish_node_ancestors.tsv
[INFO] [2020-11-10 16:15:39] (17485 lines) /app/public/data/griis/publish_scientific_names.tsv
[INFO] [2020-11-10 16:15:39] (85500 lines) /app/public/data/griis/publish_traits.tsv
[INFO] [2020-11-10 16:15:39] (167453 lines) /app/public/data/griis/publish_metadata.tsv
[STOP] [2020-11-10 16:15:39] complete_harvest_instance
[START] [2020-11-10 16:15:39] completed
[STOP] [2020-11-10 16:15:39] completed
[STOP] [2020-11-10 16:15:39] logged process, took 2841.56
[INFO] [2021-04-04 13:06:12] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2021-04-04 13:06:14] ## remove_type: ScientificName
[INFO] [2021-04-04 13:06:14] ++ Calling delete_all on 17485 instances...
[INFO] [2021-04-04 13:06:15] [13:06:15.950] Removed 17485 Scientificnames
[INFO] [2021-04-04 13:06:15] ## remove_type: Vernacular
[INFO] [2021-04-04 13:06:15] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-04 13:06:15] [13:06:15.952] Removed 0 Vernaculars
[INFO] [2021-04-04 13:06:15] ## remove_type: Article
[INFO] [2021-04-04 13:06:15] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-04 13:06:15] [13:06:15.953] Removed 0 Articles
[INFO] [2021-04-04 13:06:15] ## remove_type: Medium
[INFO] [2021-04-04 13:06:15] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-04 13:06:15] [13:06:15.955] Removed 0 Media
[INFO] [2021-04-04 13:06:15] ## remove_type: Trait
[INFO] [2021-04-04 13:06:15] ++ Calling delete_all on 85499 instances...
[INFO] [2021-04-04 13:06:44] [13:06:44.250] Removed 85499 Traits
[INFO] [2021-04-04 13:06:44] ## remove_type: MetaTrait
[INFO] [2021-04-04 13:06:44] ++ Calling delete_all on 85499 instances...
[INFO] [2021-04-04 13:06:49] [13:06:49.083] Removed 85499 Metatraits
[INFO] [2021-04-04 13:06:49] ## remove_type: OccurrenceMetadatum
[INFO] [2021-04-04 13:06:49] ++ Calling delete_all on 81953 instances...
[INFO] [2021-04-04 13:06:57] [13:06:57.465] Removed 81953 Occurrencemetadata
[INFO] [2021-04-04 13:06:57] ## remove_type: Assoc
[INFO] [2021-04-04 13:06:57] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-04 13:06:57] [13:06:57.466] Removed 0 Assocs
[INFO] [2021-04-04 13:06:57] ## remove_type: MetaAssoc
[INFO] [2021-04-04 13:06:57] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-04 13:06:57] [13:06:57.468] Removed 0 Metaassocs
[INFO] [2021-04-04 13:06:57] ## remove_type: Identifier
[INFO] [2021-04-04 13:06:57] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-04 13:06:57] [13:06:57.469] Removed 0 Identifiers
[INFO] [2021-04-04 13:06:57] ## remove_type: Reference
[INFO] [2021-04-04 13:06:57] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-04 13:06:57] [13:06:57.470] Removed 0 References
[INFO] [2021-04-04 13:06:57] Starting batch with ID 81636134...
[INFO] [2021-04-04 13:06:59] Starting batch with ID 81636134...
[INFO] [2021-04-04 13:07:00] Starting batch with ID 81636134...
[INFO] [2021-04-04 13:07:00] Starting batch with ID 81646485...
[INFO] [2021-04-04 13:07:01] Starting batch with ID 81646485...
[INFO] [2021-04-04 13:07:01] Starting batch with ID 81647255...
[INFO] [2021-04-04 13:07:02] Starting batch with ID 81647255...
[INFO] [2021-04-04 13:07:02] Starting batch with ID 81638412...
[INFO] [2021-04-04 13:07:02] Starting batch with ID 81638412...
[INFO] [2021-04-04 13:07:03] Starting batch with ID 81638412...
[INFO] [2021-04-04 13:07:03] ## remove_type: Node
[INFO] [2021-04-04 13:07:03] ++ Calling delete_all on 17485 instances...
[INFO] [2021-04-04 13:07:04] [13:07:04.725] Removed 17485 Nodes
[START] [2021-04-04 13:07:22] logged process: 5ecc716a6a5541910d0c854f5a0c8d1651b82ad0 Improved MetaXml.ignore and added publisher to media (ignored)
[START] [2021-04-04 13:07:22] Creating resource from OpenData
[START] [2021-04-04 13:07:24] logged process: 5ecc716a6a5541910d0c854f5a0c8d1651b82ad0 Improved MetaXml.ignore and added publisher to media (ignored)
[START] [2021-04-04 13:07:24] Parse meta.xml file and create formats with fields
[STOP] [2021-04-04 13:07:34] Parse meta.xml file and create formats with fields
[STOP] [2021-04-04 13:07:34] Creating resource from OpenData
[START] [2021-04-04 13:07:35] logged process: 5ecc716a6a5541910d0c854f5a0c8d1651b82ad0 Improved MetaXml.ignore and added publisher to media (ignored)
[START] [2021-04-04 13:07:35] create_harvest_instance
[INFO] [2021-04-04 13:07:35] Created harvest instance #3637
[STOP] [2021-04-04 13:07:35] create_harvest_instance
[START] [2021-04-04 13:07:35] fetch_files
[STOP] [2021-04-04 13:07:35] fetch_files
[START] [2021-04-04 13:07:35] validate_each_file
[INFO] [2021-04-04 13:07:35] Looping over 3 formats...
[INFO] [2021-04-04 13:07:35] ...nodes (/app/public/data/griis/taxon.tab)
[INFO] [2021-04-04 13:07:35] Valid: /app/public/converted_csv/griis_nodes_3637.csv (14891 lines)
[INFO] [2021-04-04 13:07:35] ...occurrences (/app/public/data/griis/occurrence_specific.tab)
[INFO] [2021-04-04 13:07:37] Valid: /app/public/converted_csv/griis_occurrences_3637.csv (57655 lines)
[INFO] [2021-04-04 13:07:37] ...measurements (/app/public/data/griis/measurement_or_fact_specific.tab)
[INFO] [2021-04-04 13:07:39] Valid: /app/public/converted_csv/griis_measurements_3637.csv (85499 lines)
[STOP] [2021-04-04 13:07:39] validate_each_file
[START] [2021-04-04 13:07:39] convert_to_csv
[INFO] [2021-04-04 13:07:39] Looping over 3 formats...
[INFO] [2021-04-04 13:07:39] ...nodes (/app/public/data/griis/taxon.tab)
[CMD] [2021-04-04 13:07:39] /usr/bin/sort /app/public/converted_csv/griis_nodes_3637.csv > /app/public/converted_csv/griis_nodes_3637.csv_sorted
[INFO] [2021-04-04 13:07:40] Converted: /app/public/converted_csv/griis_nodes_3637.csv (14891 lines)
[INFO] [2021-04-04 13:07:40] ...occurrences (/app/public/data/griis/occurrence_specific.tab)
[CMD] [2021-04-04 13:07:40] /usr/bin/sort /app/public/converted_csv/griis_occurrences_3637.csv > /app/public/converted_csv/griis_occurrences_3637.csv_sorted
[INFO] [2021-04-04 13:07:41] Converted: /app/public/converted_csv/griis_occurrences_3637.csv (57655 lines)
[INFO] [2021-04-04 13:07:41] ...measurements (/app/public/data/griis/measurement_or_fact_specific.tab)
[CMD] [2021-04-04 13:07:41] /usr/bin/sort /app/public/converted_csv/griis_measurements_3637.csv > /app/public/converted_csv/griis_measurements_3637.csv_sorted
[INFO] [2021-04-04 13:07:42] Converted: /app/public/converted_csv/griis_measurements_3637.csv (85499 lines)
[STOP] [2021-04-04 13:07:42] convert_to_csv
[START] [2021-04-04 13:07:42] calculate_delta
[INFO] [2021-04-04 13:07:42] Looping over 3 formats...
[INFO] [2021-04-04 13:07:42] ...nodes (/app/public/data/griis/taxon.tab)
[CMD] [2021-04-04 13:07:42] echo "0a" > /app/public/diff/griis_nodes_3637.diff
[CMD] [2021-04-04 13:07:43] tail -n +1 /app/public/converted_csv/griis_nodes_3637.csv >> /app/public/diff/griis_nodes_3637.diff
[CMD] [2021-04-04 13:07:44] echo "." >> /app/public/diff/griis_nodes_3637.diff
[INFO] [2021-04-04 13:07:45] Created diff: /app/public/diff/griis_nodes_3637.diff (14893 lines)
[INFO] [2021-04-04 13:07:45] ...occurrences (/app/public/data/griis/occurrence_specific.tab)
[CMD] [2021-04-04 13:07:45] echo "0a" > /app/public/diff/griis_occurrences_3637.diff
[CMD] [2021-04-04 13:07:47] tail -n +1 /app/public/converted_csv/griis_occurrences_3637.csv >> /app/public/diff/griis_occurrences_3637.diff
[CMD] [2021-04-04 13:07:48] echo "." >> /app/public/diff/griis_occurrences_3637.diff
[INFO] [2021-04-04 13:07:49] Created diff: /app/public/diff/griis_occurrences_3637.diff (57657 lines)
[INFO] [2021-04-04 13:07:49] ...measurements (/app/public/data/griis/measurement_or_fact_specific.tab)
[CMD] [2021-04-04 13:07:49] echo "0a" > /app/public/diff/griis_measurements_3637.diff
[CMD] [2021-04-04 13:07:50] tail -n +1 /app/public/converted_csv/griis_measurements_3637.csv >> /app/public/diff/griis_measurements_3637.diff
[CMD] [2021-04-04 13:07:51] echo "." >> /app/public/diff/griis_measurements_3637.diff
[INFO] [2021-04-04 13:07:52] Created diff: /app/public/diff/griis_measurements_3637.diff (85501 lines)
[STOP] [2021-04-04 13:07:52] calculate_delta
[START] [2021-04-04 13:07:52] parse_diff_and_store
[INFO] [2021-04-04 13:07:52] Handling diff: /app/public/diff/griis_nodes_3637.diff (14893 lines)
[INFO] [2021-04-04 13:07:53] Loading nodes diff file into memory (14893 /app/public/diff/griis_nodes_3637.diff lines)...
[WARN] [2021-04-04 13:07:54] New Taxonomic status: species; treatings as unusable...
[WARN] [2021-04-04 13:07:54] Filtered Scientific Name `Circenita varia  (Born, 1778)` to `Circenita varia (Born, 1778)`
[WARN] [2021-04-04 13:07:55] New Taxonomic status: variety; treatings as unusable...
[WARN] [2021-04-04 13:07:55] Filtered Scientific Name `Dialeurodes citri  (Ashmed, 1885)` to `Dialeurodes citri (Ashmed, 1885)`
[WARN] [2021-04-04 13:07:55] Filtered Scientific Name `Belucia acinanthera Triana  Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.` to `Belucia acinanthera Triana Synonim: Bellucia pentamera Naudin Bellucia costariensis Cogn.`
[WARN] [2021-04-04 13:07:57] Filtered Scientific Name `Priacanthus sagittarius  Starnes, 1988` to `Priacanthus sagittarius Starnes, 1988`
[WARN] [2021-04-04 13:07:57] Filtered Scientific Name `Egeria dense  Planch.` to `Egeria dense Planch.`
[WARN] [2021-04-04 13:07:58] Filtered Scientific Name `Glyphodes pyloalis  (Walker,1859)` to `Glyphodes pyloalis (Walker,1859)`
[WARN] [2021-04-04 13:07:58] Filtered Scientific Name `Alpheus edwardsii  (Audouin, 1826)` to `Alpheus edwardsii (Audouin, 1826)`
[WARN] [2021-04-04 13:07:58] Filtered Scientific Name `Allotropa  burrelli Muesebeck, 1942` to `Allotropa burrelli Muesebeck, 1942`
[WARN] [2021-04-04 13:07:59] Filtered Scientific Name `Allotropa  convexifrons Muesebeck, 1943` to `Allotropa convexifrons Muesebeck, 1943`
[INFO] [2021-04-04 13:07:59] Handling diff: /app/public/diff/griis_occurrences_3637.diff (57657 lines)
[INFO] [2021-04-04 13:08:00] Loading occurrences diff file into memory (57657 /app/public/diff/griis_occurrences_3637.diff lines)...
[INFO] [2021-04-04 13:08:15] Handling diff: /app/public/diff/griis_measurements_3637.diff (85501 lines)
[INFO] [2021-04-04 13:08:16] Loading measurements diff file into memory (85501 /app/public/diff/griis_measurements_3637.diff lines)...
[INFO] [2021-04-04 13:08:51] Storing 17485 ScientificNames
[INFO] [2021-04-04 13:08:51] Processing group of 17485 in 18 groups of 1000
[INFO] [2021-04-04 13:08:56] Average Time: 0.281
[INFO] [2021-04-04 13:08:56] Total Time: 6s
[INFO] [2021-04-04 13:08:56] last 3 / first 3: 0.8
[INFO] [2021-04-04 13:08:56] Std.Dev: 0.044721359549995794; Max: 0.38
[INFO] [2021-04-04 13:08:56] Storing 17485 Nodes
[INFO] [2021-04-04 13:08:56] Processing group of 17485 in 18 groups of 1000
[INFO] [2021-04-04 13:09:01] Average Time: 0.279
[INFO] [2021-04-04 13:09:01] Total Time: 6s
[INFO] [2021-04-04 13:09:01] last 3 / first 3: 0.96
[INFO] [2021-04-04 13:09:01] Std.Dev: 0.044721359549995794; Max: 0.39
[INFO] [2021-04-04 13:09:01] Storing 57655 Occurrences
[INFO] [2021-04-04 13:09:01] Processing group of 57655 in 58 groups of 1000
[INFO] [2021-04-04 13:09:09] Average Time: 0.123
[INFO] [2021-04-04 13:09:09] Total Time: 8s
[INFO] [2021-04-04 13:09:09] last 3 / first 3: 1.27
[INFO] [2021-04-04 13:09:09] Std.Dev: 0.044721359549995794; Max: 0.3
[INFO] [2021-04-04 13:09:09] Storing 81953 OccurrenceMetadata
[INFO] [2021-04-04 13:09:09] Processing group of 81953 in 82 groups of 1000
[INFO] [2021-04-04 13:09:22] Average Time: 0.157
[INFO] [2021-04-04 13:09:22] Total Time: 14s
[INFO] [2021-04-04 13:09:22] last 3 / first 3: 0.97
[INFO] [2021-04-04 13:09:22] Std.Dev: 0.08944271909999159; Max: 0.48
[INFO] [2021-04-04 13:09:22] Storing 85499 Traits
[INFO] [2021-04-04 13:09:22] Processing group of 85499 in 86 groups of 1000
[INFO] [2021-04-04 13:09:47] Average Time: 0.288
[INFO] [2021-04-04 13:09:47] Total Time: 26s
[INFO] [2021-04-04 13:09:47] last 3 / first 3: 0.91
[INFO] [2021-04-04 13:09:47] Std.Dev: 0.044721359549995794; Max: 0.42
[INFO] [2021-04-04 13:09:47] Storing 85499 MetaTraits
[INFO] [2021-04-04 13:09:47] Processing group of 85499 in 86 groups of 1000
[INFO] [2021-04-04 13:10:00] Average Time: 0.148
[INFO] [2021-04-04 13:10:00] Total Time: 13s
[INFO] [2021-04-04 13:10:00] last 3 / first 3: 0.85
[INFO] [2021-04-04 13:10:00] Std.Dev: 0.08944271909999159; Max: 0.47
[STOP] [2021-04-04 13:10:00] parse_diff_and_store
[START] [2021-04-04 13:10:00] resolve_keys
[INFO] [2021-04-04 13:10:11] Occurrences to nodes (through scientific_names)...
[INFO] [2021-04-04 13:10:14] traits to occurrences...
[INFO] [2021-04-04 13:10:17] traits to nodes (through occurrences)...
[INFO] [2021-04-04 13:10:19] Traits to sex term...
[INFO] [2021-04-04 13:10:20] Traits to lifestage term...
[INFO] [2021-04-04 13:10:22] MetaTraits to traits...
[INFO] [2021-04-04 13:10:24] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-04-04 13:10:24] Assocs to occurrences...
[INFO] [2021-04-04 13:10:24] Assocs to nodes...
[INFO] [2021-04-04 13:10:24] Assoc to sex term...
[INFO] [2021-04-04 13:10:24] Assoc to lifestage term...
[INFO] [2021-04-04 13:10:24] MetaAssoc to assocs...
[STOP] [2021-04-04 13:10:24] resolve_keys
[START] [2021-04-04 13:10:24] hold_for_later_1
[STOP] [2021-04-04 13:10:24] hold_for_later_1
[START] [2021-04-04 13:10:24] hold_for_later_2
[STOP] [2021-04-04 13:10:24] hold_for_later_2
[START] [2021-04-04 13:10:24] resolve_missing_parents
[STOP] [2021-04-04 13:10:25] resolve_missing_parents
[START] [2021-04-04 13:10:25] rebuild_nodes
[START] [2021-04-04 13:10:25] Flattener#flatten
[START] [2021-04-04 13:10:25] Flattener#study_resource
[START] [2021-04-04 13:10:25] Flattener#build_ancestry
[STOP] [2021-04-04 13:10:26] Flattener#build_ancestry
[INFO] [2021-04-04 13:10:26] 17485 ancestry keys
[START] [2021-04-04 13:10:26] build_node_ancestors
[INFO] [2021-04-04 13:10:26] old ancestors deleted.
[STOP] [2021-04-04 13:10:29] build_node_ancestors
[START] [2021-04-04 13:10:34] Flattener#propagate_ancestor_ids
[STOP] [2021-04-04 13:10:36] Flattener#propagate_ancestor_ids
[STOP] [2021-04-04 13:10:36] Flattener#flatten
[STOP] [2021-04-04 13:10:36] rebuild_nodes
[START] [2021-04-04 13:10:36] resolve_missing_media_owners
[STOP] [2021-04-04 13:10:36] resolve_missing_media_owners
[START] [2021-04-04 13:10:36] sanitize_media_verbatims
[STOP] [2021-04-04 13:10:36] sanitize_media_verbatims
[START] [2021-04-04 13:10:36] queue_downloads
[STOP] [2021-04-04 13:10:36] queue_downloads
[START] [2021-04-04 13:10:36] parse_names
[WARN] [2021-04-04 13:10:36] I see 17485 names which still need to be parsed.
[WARN] [2021-04-04 13:10:54] I see 81 names which still need to be parsed.
[WARN] [2021-04-04 13:10:56] I see 8 names which still need to be parsed.
[WARN] [2021-04-04 13:10:57] I see 2 names which still need to be parsed.
[STOP] [2021-04-04 13:10:58] parse_names
[START] [2021-04-04 13:10:58] denormalize_canonical_names_to_nodes
[STOP] [2021-04-04 13:10:58] denormalize_canonical_names_to_nodes
[START] [2021-04-04 13:10:58] match_nodes
[START] [2021-04-04 13:10:58] map_all_nodes_to_pages
[STOP] [2021-04-04 13:17:03] map_all_nodes_to_pages
[INFO] [2021-04-04 13:17:03] 761 Unmatched nodes (of 17485)! That's too many to output. Full list in /app/public/data/griis/unmatched_nodes.txt ; First 10: Magnoliopsida (#91696183); Vicia ervilla (#91696834); Acacia (#91697412); Acacia baileyana × Acacia leucoclada (#91697602); Desmanthus tortuosum (#91698990); Cytisus (#91703366); Desmanthus discolor (#91706435); Canavalia virosa (#91707478); Lespedeza liukiuensis (#91708649); Desmanthus uncinatum (#91709284)
[START] [2021-04-04 13:17:03] update_nodes
[STOP] [2021-04-04 13:17:10] update_nodes
[STOP] [2021-04-04 13:17:10] match_nodes
[START] [2021-04-04 13:17:10] reindex_search
[STOP] [2021-04-04 13:17:26] reindex_search
[START] [2021-04-04 13:17:26] normalize_units
[STOP] [2021-04-04 13:17:26] normalize_units
[START] [2021-04-04 13:17:26] calculate_statistics
[STOP] [2021-04-04 13:17:26] calculate_statistics
[START] [2021-04-04 13:17:26] complete_harvest_instance
[START] [2021-04-04 13:17:26] overall_tsv_creation
[INFO] [2021-04-04 13:17:26] Processing group of 17485 in 2 batches of 10000
[INFO] [2021-04-04 13:18:35] 46403 Traits (unfiltered)...
[INFO] [2021-04-04 13:23:59] 46403 Traits (filtered)...
[INFO] [2021-04-04 13:24:02] 0 Associations (filtered)...
[INFO] [2021-04-04 13:24:11] 44415 metadata added.
[INFO] [2021-04-04 13:24:11] 0 metadata added.
[INFO] [2021-04-04 13:25:46] 39096 Traits (unfiltered)...
[INFO] [2021-04-04 13:30:27] 39096 Traits (filtered)...
[INFO] [2021-04-04 13:30:30] 0 Associations (filtered)...
[INFO] [2021-04-04 13:30:36] 37538 metadata added.
[INFO] [2021-04-04 13:30:36] 0 metadata added.
[INFO] [2021-04-04 13:31:11] Average Time: 382.335
[INFO] [2021-04-04 13:31:11] Total Time: 13m45s
[STOP] [2021-04-04 13:31:11] overall_tsv_creation
[INFO] [2021-04-04 13:31:11] Done. Check your files:
[INFO] [2021-04-04 13:31:12] (17407 lines) /app/public/data/griis/publish_nodes.tsv
[INFO] [2021-04-04 13:31:13] (83176 lines) /app/public/data/griis/publish_node_ancestors.tsv
[INFO] [2021-04-04 13:31:14] (17485 lines) /app/public/data/griis/publish_scientific_names.tsv
[INFO] [2021-04-04 13:31:15] (85500 lines) /app/public/data/griis/publish_traits.tsv
[INFO] [2021-04-04 13:31:16] (81954 lines) /app/public/data/griis/publish_metadata.tsv
[STOP] [2021-04-04 13:31:16] complete_harvest_instance
[START] [2021-04-04 13:31:16] completed
[STOP] [2021-04-04 13:31:16] completed
[STOP] [2021-04-04 13:31:16] logged process, took 1422.62
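
The [START] and [STOP] lines above bracket each harvest stage, so per-stage durations can be recovered from the timestamps. The following is a minimal sketch, not part of the harvester itself: the function name stage_durations and the regular expression are assumptions made for illustration, and it assumes every line follows the `[LEVEL] [YYYY-MM-DD HH:MM:SS] message` layout seen in this log. A STOP line with no matching START (such as the final "logged process" line) is skipped.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the log above: [LEVEL] [timestamp] message
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return (stage, seconds) pairs for each matched START/STOP pair, in STOP order."""
    open_stages = {}  # stage name -> stack of start times (handles repeated names)
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # INFO/WARN lines and blanks carry no stage boundary
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_stages.setdefault(name, []).append(when)
        elif open_stages.get(name):
            started = open_stages[name].pop()
            durations.append((name, int((when - started).total_seconds())))
        # A STOP without a recorded START is ignored.
    return durations


sample = [
    "[START] [2021-04-04 13:10:58] match_nodes",
    "[START] [2021-04-04 13:10:58] map_all_nodes_to_pages",
    "[STOP] [2021-04-04 13:17:03] map_all_nodes_to_pages",
    "[START] [2021-04-04 13:17:03] update_nodes",
    "[STOP] [2021-04-04 13:17:10] update_nodes",
    "[STOP] [2021-04-04 13:17:10] match_nodes",
]

for name, seconds in stage_durations(sample):
    print(f"{name}: {seconds}s")
```

Applied to the match_nodes excerpt, this attributes nearly all of that stage's time to map_all_nodes_to_pages. Nested stages are reported independently, so a parent's duration includes its children.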

Latest Process