Harvest for COSEWIC
Created: 07 Apr 17:59
Stage: completed
Fetched: 07 Apr 17:59
Validated: 07 Apr 17:59
Deltas Created: 07 Apr 17:59
Units Normalized: 07 Apr 18:00
Ancestry Built: 07 Apr 17:59
Nodes Matched: 07 Apr 18:00
Names Parsed: 07 Apr 17:59
New Models Stored: 07 Apr 17:59
Indexed: 07 Apr 18:00
Completed: 07 Apr 18:02
Time to Harvest: less than a minute
Harvesting Log (391 lines)
# Logfile created on 2020-07-10 10:44:48 -0400 by logger.rb/v1.4.2
[INFO] [2020-07-10 10:44:48] ## HARVEST: type = -harvest
[START] [2020-07-10 10:44:49] logged process
[START] [2020-07-10 10:44:49] create_harvest_instance
[STOP] [2020-07-10 10:44:50] create_harvest_instance
[START] [2020-07-10 10:44:50] fetch_files
[STOP] [2020-07-10 10:44:50] fetch_files
[START] [2020-07-10 10:44:50] validate_each_file
[STOP] [2020-07-10 10:44:50] validate_each_file
[START] [2020-07-10 10:44:50] convert_to_csv
[CMD] [2020-07-10 10:44:50] /usr/bin/sort /app/public/converted_csv/COSEWIC_nodes_21601.csv > /app/public/converted_csv/COSEWIC_nodes_21601.csv_sorted
[CMD] [2020-07-10 10:44:50] /usr/bin/sort /app/public/converted_csv/COSEWIC_occurrences_21602.csv > /app/public/converted_csv/COSEWIC_occurrences_21602.csv_sorted
[CMD] [2020-07-10 10:44:51] /usr/bin/sort /app/public/converted_csv/COSEWIC_measurements_21603.csv > /app/public/converted_csv/COSEWIC_measurements_21603.csv_sorted
[STOP] [2020-07-10 10:44:51] convert_to_csv
[START] [2020-07-10 10:44:51] calculate_delta
[CMD] [2020-07-10 10:44:51] echo "0a" > /app/public/diff/COSEWIC_nodes_21601.diff
[CMD] [2020-07-10 10:44:51] tail -n +1 /app/public/converted_csv/COSEWIC_nodes_21601.csv >> /app/public/diff/COSEWIC_nodes_21601.diff
[CMD] [2020-07-10 10:44:51] echo "." >> /app/public/diff/COSEWIC_nodes_21601.diff
[CMD] [2020-07-10 10:44:51] echo "0a" > /app/public/diff/COSEWIC_occurrences_21602.diff
[CMD] [2020-07-10 10:44:51] tail -n +1 /app/public/converted_csv/COSEWIC_occurrences_21602.csv >> /app/public/diff/COSEWIC_occurrences_21602.diff
[CMD] [2020-07-10 10:44:51] echo "." >> /app/public/diff/COSEWIC_occurrences_21602.diff
[CMD] [2020-07-10 10:44:51] echo "0a" > /app/public/diff/COSEWIC_measurements_21603.diff
[CMD] [2020-07-10 10:44:51] tail -n +1 /app/public/converted_csv/COSEWIC_measurements_21603.csv >> /app/public/diff/COSEWIC_measurements_21603.diff
[CMD] [2020-07-10 10:44:51] echo "." >> /app/public/diff/COSEWIC_measurements_21603.diff
[STOP] [2020-07-10 10:44:51] calculate_delta
[START] [2020-07-10 10:44:51] parse_diff_and_store
[INFO] [2020-07-10 10:44:51] Loading nodes diff file into memory (true lines)...
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Accipiter gentilis atricapillus` to `Accipiter gentilis atricapillus`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Accipiter gentilis laingi` to `Accipiter gentilis laingi`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Aegolius acadicus brooksi` to `Aegolius acadicus brooksi`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Ammodramus savannarum pratensis` to `Ammodramus savannarum pratensis`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Anguispira kochi kochi` to `Anguispira kochi kochi`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Anguispira kochi occidentalis` to `Anguispira kochi occidentalis`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Ardea herodias fannini` to `Ardea herodias fannini`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Balaenoptera acutorostrata acutorostrata` to `Balaenoptera acutorostrata acutorostrata`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Balaenoptera acutorostrata scammonii` to `Balaenoptera acutorostrata scammonii`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Bison bison athabascae` to `Bison bison athabascae`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Bison bison bison` to `Bison bison bison`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Bombus occidentalis mckayi` to `Bombus occidentalis mckayi`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Bombus occidentalis occidentalis` to `Bombus occidentalis occidentalis`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Calidris canutus islandica` to `Calidris canutus islandica`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Calidris canutus roselaari` to `Calidris canutus roselaari`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Calidris canutus rufa` to `Calidris canutus rufa`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Canis lupus arctos` to `Canis lupus arctos`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Canis lupus nubilus` to `Canis lupus nubilus`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Canis lupus occidentalis` to `Canis lupus occidentalis`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Centrocercus urophasianus phaios` to `Centrocercus urophasianus phaios`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Centrocercus urophasianus urophasianus` to `Centrocercus urophasianus urophasianus`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Charadrius melodus circumcinctus` to `Charadrius melodus circumcinctus`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Charadrius melodus melodus` to `Charadrius melodus melodus`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Chrysemys picta bellii` to `Chrysemys picta bellii`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Chrysemys picta bellii` to `Chrysemys picta bellii`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Chrysemys picta marginata` to `Chrysemys picta marginata`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Chrysemys picta picta` to `Chrysemys picta picta`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Cicindela formosa gibsoni` to `Cicindela formosa gibsoni`
[WARN] [2020-07-10 10:44:51] Filtered Scientific Name `Cicindela parowana wallisi` to `Cicindela parowana wallisi`
[WARN] [2020-07-10 10:44:51] (Reached filtered-name limit; suppressing further warnings.)
[INFO] [2020-07-10 10:44:51] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-07-10 10:44:51] Loading measurements diff file into memory (true lines)...
[INFO] [2020-07-10 10:44:57] Storing 886 ScientificNames
[INFO] [2020-07-10 10:44:57] Processing group of 886 in 1 groups of 1000
[INFO] [2020-07-10 10:44:57] Average Time: 0.33
[INFO] [2020-07-10 10:44:57] Total Time: 1s
[INFO] [2020-07-10 10:44:57] Storing 886 Nodes
[INFO] [2020-07-10 10:44:57] Processing group of 886 in 1 groups of 1000
[INFO] [2020-07-10 10:44:57] Average Time: 0.33
[INFO] [2020-07-10 10:44:57] Total Time: 1s
[INFO] [2020-07-10 10:44:57] Storing 871 Occurrences
[INFO] [2020-07-10 10:44:57] Processing group of 871 in 1 groups of 1000
[INFO] [2020-07-10 10:44:57] Average Time: 0.1
[INFO] [2020-07-10 10:44:57] Total Time: 1s
[INFO] [2020-07-10 10:44:57] Storing 1391 Traits
[INFO] [2020-07-10 10:44:57] Processing group of 1391 in 2 groups of 1000
[INFO] [2020-07-10 10:44:58] Average Time: 0.27
[INFO] [2020-07-10 10:44:58] Total Time: 1s
[INFO] [2020-07-10 10:44:58] Storing 2820 MetaTraits
[INFO] [2020-07-10 10:44:58] Processing group of 2820 in 3 groups of 1000
[INFO] [2020-07-10 10:44:58] Average Time: 0.107
[INFO] [2020-07-10 10:44:58] Total Time: 1s
[STOP] [2020-07-10 10:44:58] parse_diff_and_store
[START] [2020-07-10 10:44:58] resolve_keys
[INFO] [2020-07-10 10:45:07] Occurrences to nodes (through scientific_names)...
[INFO] [2020-07-10 10:45:07] traits to occurrences...
[INFO] [2020-07-10 10:45:07] traits to nodes (through occurrences)...
[INFO] [2020-07-10 10:45:07] Traits to sex term...
[INFO] [2020-07-10 10:45:07] Traits to lifestage term...
[INFO] [2020-07-10 10:45:07] MetaTraits to traits...
[INFO] [2020-07-10 10:45:07] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-07-10 10:45:07] Assocs to occurrences...
[INFO] [2020-07-10 10:45:07] Assocs to nodes...
[INFO] [2020-07-10 10:45:07] Assoc to sex term...
[INFO] [2020-07-10 10:45:07] Assoc to lifestage term...
[STOP] [2020-07-10 10:45:07] resolve_keys
[START] [2020-07-10 10:45:07] hold_for_later_1
[STOP] [2020-07-10 10:45:07] hold_for_later_1
[START] [2020-07-10 10:45:07] hold_for_later_2
[STOP] [2020-07-10 10:45:07] hold_for_later_2
[START] [2020-07-10 10:45:07] resolve_missing_parents
[STOP] [2020-07-10 10:45:08] resolve_missing_parents
[START] [2020-07-10 10:45:08] rebuild_nodes
[START] [2020-07-10 10:45:08] Flattener#flatten
[START] [2020-07-10 10:45:08] Flattener#study_resource
[START] [2020-07-10 10:45:08] Flattener#build_ancestry
[STOP] [2020-07-10 10:45:08] Flattener#build_ancestry
[INFO] [2020-07-10 10:45:08] 886 ancestry keys
[START] [2020-07-10 10:45:08] build_node_ancestors
[INFO] [2020-07-10 10:45:08] old ancestors deleted.
[STOP] [2020-07-10 10:45:08] build_node_ancestors
[START] [2020-07-10 10:45:08] Flattener#propagate_ancestor_ids
[STOP] [2020-07-10 10:45:08] Flattener#propagate_ancestor_ids
[STOP] [2020-07-10 10:45:08] Flattener#flatten
[STOP] [2020-07-10 10:45:08] rebuild_nodes
[START] [2020-07-10 10:45:08] resolve_missing_media_owners
[STOP] [2020-07-10 10:45:08] resolve_missing_media_owners
[START] [2020-07-10 10:45:08] sanitize_media_verbatims
[STOP] [2020-07-10 10:45:08] sanitize_media_verbatims
[START] [2020-07-10 10:45:08] queue_downloads
[STOP] [2020-07-10 10:45:08] queue_downloads
[START] [2020-07-10 10:45:08] parse_names
[WARN] [2020-07-10 10:45:08] I see 886 names which still need to be parsed.
[WARN] [2020-07-10 10:45:10] I see 1 names which still need to be parsed.
[STOP] [2020-07-10 10:45:11] parse_names
[START] [2020-07-10 10:45:11] denormalize_canonical_names_to_nodes
[STOP] [2020-07-10 10:45:11] denormalize_canonical_names_to_nodes
[START] [2020-07-10 10:45:11] match_nodes
[START] [2020-07-10 10:45:11] map_all_nodes_to_pages
[STOP] [2020-07-10 10:46:43] map_all_nodes_to_pages
[INFO] [2020-07-10 10:46:43] 66 Unmatched nodes (of 886)! That's too many to output. First 10: Haplodontium macrocarpum (#80144625); Asplenium scolopendrium americanum (#80144326); Carex nebrascensis (#80144401); Frasera caroliniensis (#80144586); Minuartia pusilla (#80144767); Valeriana edulis ciliata (#80145104); Viola praemorsa praemorsa (#80145110); Erioderma pedicellatum (#80144554); Peltigera gowardii (#80144847); Pseudevernia cladonia (#80144918)
[START] [2020-07-10 10:46:43] update_nodes
[STOP] [2020-07-10 10:46:43] update_nodes
[STOP] [2020-07-10 10:46:43] match_nodes
[START] [2020-07-10 10:46:43] reindex_search
[STOP] [2020-07-10 10:46:44] reindex_search
[START] [2020-07-10 10:46:44] normalize_units
[STOP] [2020-07-10 10:46:44] normalize_units
[START] [2020-07-10 10:46:44] calculate_statistics
[STOP] [2020-07-10 10:46:44] calculate_statistics
[START] [2020-07-10 10:46:44] complete_harvest_instance
[START] [2020-07-10 10:46:44] overall_tsv_creation
[INFO] [2020-07-10 10:46:45] Processing group of 886 in 1 batches of 10000
[INFO] [2020-07-10 10:48:35] 1086 Traits (unfiltered)...
[INFO] [2020-07-10 10:48:48] 1086 Traits (filtered)...
[INFO] [2020-07-10 10:48:48] 0 Associations (filtered)...
[INFO] [2020-07-10 10:49:25] 3125 metadata added.
[INFO] [2020-07-10 10:49:25] 0 metadata added.
[INFO] [2020-07-10 10:49:25] Average Time: 75.97
[INFO] [2020-07-10 10:49:25] Total Time: 2m41s
[STOP] [2020-07-10 10:49:25] overall_tsv_creation
[INFO] [2020-07-10 10:49:25] Done. Check your files:
[INFO] [2020-07-10 10:49:25] (885 lines) /app/public/data/COSEWIC/publish_nodes.tsv
[INFO] [2020-07-10 10:49:25] (2199 lines) /app/public/data/COSEWIC/publish_node_ancestors.tsv
[INFO] [2020-07-10 10:49:25] (886 lines) /app/public/data/COSEWIC/publish_scientific_names.tsv
[INFO] [2020-07-10 10:49:25] (1087 lines) /app/public/data/COSEWIC/publish_traits.tsv
[INFO] [2020-07-10 10:49:25] (3126 lines) /app/public/data/COSEWIC/publish_metadata.tsv
[STOP] [2020-07-10 10:49:25] complete_harvest_instance
[START] [2020-07-10 10:49:26] completed
[STOP] [2020-07-10 10:49:26] completed
[STOP] [2020-07-10 10:49:26] logged process, took 276.21
[INFO] [2021-04-07 17:35:33] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2021-04-07 17:59:26] ## remove_type: ScientificName
[INFO] [2021-04-07 17:59:26] ++ Calling delete_all on 886 instances...
[INFO] [2021-04-07 17:59:26] [17:59:26.815] Removed 886 Scientificnames
[INFO] [2021-04-07 17:59:26] ## remove_type: Vernacular
[INFO] [2021-04-07 17:59:26] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-07 17:59:26] [17:59:26.821] Removed 0 Vernaculars
[INFO] [2021-04-07 17:59:26] ## remove_type: Article
[INFO] [2021-04-07 17:59:26] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-07 17:59:26] [17:59:26.822] Removed 0 Articles
[INFO] [2021-04-07 17:59:26] ## remove_type: Medium
[INFO] [2021-04-07 17:59:26] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-07 17:59:26] [17:59:26.824] Removed 0 Media
[INFO] [2021-04-07 17:59:26] ## remove_type: Trait
[INFO] [2021-04-07 17:59:26] ++ Calling delete_all on 1391 instances...
[INFO] [2021-04-07 17:59:27] [17:59:27.014] Removed 1391 Traits
[INFO] [2021-04-07 17:59:27] ## remove_type: MetaTrait
[INFO] [2021-04-07 17:59:27] ++ Calling delete_all on 2820 instances...
[INFO] [2021-04-07 17:59:27] [17:59:27.167] Removed 2820 Metatraits
[INFO] [2021-04-07 17:59:27] ## remove_type: OccurrenceMetadatum
[INFO] [2021-04-07 17:59:27] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-07 17:59:27] [17:59:27.168] Removed 0 Occurrencemetadata
[INFO] [2021-04-07 17:59:27] ## remove_type: Assoc
[INFO] [2021-04-07 17:59:27] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-07 17:59:27] [17:59:27.170] Removed 0 Assocs
[INFO] [2021-04-07 17:59:27] ## remove_type: MetaAssoc
[INFO] [2021-04-07 17:59:27] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-07 17:59:27] [17:59:27.171] Removed 0 Metaassocs
[INFO] [2021-04-07 17:59:27] ## remove_type: Identifier
[INFO] [2021-04-07 17:59:27] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-07 17:59:27] [17:59:27.173] Removed 0 Identifiers
[INFO] [2021-04-07 17:59:27] ## remove_type: Reference
[INFO] [2021-04-07 17:59:27] ++ Calling delete_all on 0 instances...
[INFO] [2021-04-07 17:59:27] [17:59:27.174] Removed 0 References
[INFO] [2021-04-07 17:59:27] Starting batch with ID 80144234...
[INFO] [2021-04-07 17:59:27] Starting batch with ID 80144234...
[INFO] [2021-04-07 17:59:27] Starting batch with ID 80144234...
[INFO] [2021-04-07 17:59:27] Starting batch with ID 80144234...
[INFO] [2021-04-07 17:59:27] Starting batch with ID 80144234...
[INFO] [2021-04-07 17:59:27] Starting batch with ID 80144234...
[INFO] [2021-04-07 17:59:27] Starting batch with ID 80144234...
[INFO] [2021-04-07 17:59:27] Starting batch with ID 80144234...
[INFO] [2021-04-07 17:59:29] ## remove_type: Node
[INFO] [2021-04-07 17:59:29] ++ Calling delete_all on 886 instances...
[INFO] [2021-04-07 17:59:29] [17:59:29.087] Removed 886 Nodes
[START] [2021-04-07 17:59:29] logged process: 5ecc716a6a5541910d0c854f5a0c8d1651b82ad0 Improved MetaXml.ignore and added publisher to media (ignored)
[START] [2021-04-07 17:59:30] Creating resource from OpenData
[START] [2021-04-07 17:59:31] logged process: 5ecc716a6a5541910d0c854f5a0c8d1651b82ad0 Improved MetaXml.ignore and added publisher to media (ignored)
[START] [2021-04-07 17:59:31] Parse meta.xml file and create formats with fields
[STOP] [2021-04-07 17:59:36] Parse meta.xml file and create formats with fields
[STOP] [2021-04-07 17:59:36] Creating resource from OpenData
[START] [2021-04-07 17:59:36] logged process: 5ecc716a6a5541910d0c854f5a0c8d1651b82ad0 Improved MetaXml.ignore and added publisher to media (ignored)
[START] [2021-04-07 17:59:36] create_harvest_instance
[INFO] [2021-04-07 17:59:36] Created harvest instance #3693
[STOP] [2021-04-07 17:59:36] create_harvest_instance
[START] [2021-04-07 17:59:36] fetch_files
[STOP] [2021-04-07 17:59:36] fetch_files
[START] [2021-04-07 17:59:36] validate_each_file
[INFO] [2021-04-07 17:59:36] Looping over 3 formats...
[INFO] [2021-04-07 17:59:36] ...nodes (/app/public/data/COSEWIC/taxa.txt)
[INFO] [2021-04-07 17:59:36] Valid: /app/public/converted_csv/COSEWIC_nodes_3693.csv (886 lines)
[INFO] [2021-04-07 17:59:36] ...occurrences (/app/public/data/COSEWIC/occurrences.txt)
[INFO] [2021-04-07 17:59:36] Valid: /app/public/converted_csv/COSEWIC_occurrences_3693.csv (871 lines)
[INFO] [2021-04-07 17:59:36] ...measurements (/app/public/data/COSEWIC/measurementsorfacts.txt)
[INFO] [2021-04-07 17:59:36] Valid: /app/public/converted_csv/COSEWIC_measurements_3693.csv (1391 lines)
[STOP] [2021-04-07 17:59:36] validate_each_file
[START] [2021-04-07 17:59:36] convert_to_csv
[INFO] [2021-04-07 17:59:36] Looping over 3 formats...
[INFO] [2021-04-07 17:59:36] ...nodes (/app/public/data/COSEWIC/taxa.txt)
[CMD] [2021-04-07 17:59:36] /usr/bin/sort /app/public/converted_csv/COSEWIC_nodes_3693.csv > /app/public/converted_csv/COSEWIC_nodes_3693.csv_sorted
[INFO] [2021-04-07 17:59:37] Converted: /app/public/converted_csv/COSEWIC_nodes_3693.csv (886 lines)
[INFO] [2021-04-07 17:59:37] ...occurrences (/app/public/data/COSEWIC/occurrences.txt)
[CMD] [2021-04-07 17:59:37] /usr/bin/sort /app/public/converted_csv/COSEWIC_occurrences_3693.csv > /app/public/converted_csv/COSEWIC_occurrences_3693.csv_sorted
[INFO] [2021-04-07 17:59:38] Converted: /app/public/converted_csv/COSEWIC_occurrences_3693.csv (871 lines)
[INFO] [2021-04-07 17:59:38] ...measurements (/app/public/data/COSEWIC/measurementsorfacts.txt)
[CMD] [2021-04-07 17:59:38] /usr/bin/sort /app/public/converted_csv/COSEWIC_measurements_3693.csv > /app/public/converted_csv/COSEWIC_measurements_3693.csv_sorted
[INFO] [2021-04-07 17:59:38] Converted: /app/public/converted_csv/COSEWIC_measurements_3693.csv (1391 lines)
[STOP] [2021-04-07 17:59:38] convert_to_csv
[START] [2021-04-07 17:59:38] calculate_delta
[INFO] [2021-04-07 17:59:38] Looping over 3 formats...
[INFO] [2021-04-07 17:59:38] ...nodes (/app/public/data/COSEWIC/taxa.txt)
[CMD] [2021-04-07 17:59:38] echo "0a" > /app/public/diff/COSEWIC_nodes_3693.diff
[CMD] [2021-04-07 17:59:39] tail -n +1 /app/public/converted_csv/COSEWIC_nodes_3693.csv >> /app/public/diff/COSEWIC_nodes_3693.diff
[CMD] [2021-04-07 17:59:39] echo "." >> /app/public/diff/COSEWIC_nodes_3693.diff
[INFO] [2021-04-07 17:59:40] Created diff: /app/public/diff/COSEWIC_nodes_3693.diff (888 lines)
[INFO] [2021-04-07 17:59:40] ...occurrences (/app/public/data/COSEWIC/occurrences.txt)
[CMD] [2021-04-07 17:59:40] echo "0a" > /app/public/diff/COSEWIC_occurrences_3693.diff
[CMD] [2021-04-07 17:59:40] tail -n +1 /app/public/converted_csv/COSEWIC_occurrences_3693.csv >> /app/public/diff/COSEWIC_occurrences_3693.diff
[CMD] [2021-04-07 17:59:41] echo "." >> /app/public/diff/COSEWIC_occurrences_3693.diff
[INFO] [2021-04-07 17:59:41] Created diff: /app/public/diff/COSEWIC_occurrences_3693.diff (873 lines)
[INFO] [2021-04-07 17:59:41] ...measurements (/app/public/data/COSEWIC/measurementsorfacts.txt)
[CMD] [2021-04-07 17:59:41] echo "0a" > /app/public/diff/COSEWIC_measurements_3693.diff
[CMD] [2021-04-07 17:59:42] tail -n +1 /app/public/converted_csv/COSEWIC_measurements_3693.csv >> /app/public/diff/COSEWIC_measurements_3693.diff
[CMD] [2021-04-07 17:59:43] echo "." >> /app/public/diff/COSEWIC_measurements_3693.diff
[INFO] [2021-04-07 17:59:43] Created diff: /app/public/diff/COSEWIC_measurements_3693.diff (1393 lines)
[STOP] [2021-04-07 17:59:43] calculate_delta
[START] [2021-04-07 17:59:43] parse_diff_and_store
[INFO] [2021-04-07 17:59:43] Handling diff: /app/public/diff/COSEWIC_nodes_3693.diff (888 lines)
[INFO] [2021-04-07 17:59:44] Loading nodes diff file into memory (888 /app/public/diff/COSEWIC_nodes_3693.diff lines)...
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Accipiter gentilis atricapillus` to `Accipiter gentilis atricapillus`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Accipiter gentilis laingi` to `Accipiter gentilis laingi`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Aegolius acadicus brooksi` to `Aegolius acadicus brooksi`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Ammodramus savannarum pratensis` to `Ammodramus savannarum pratensis`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Anguispira kochi kochi` to `Anguispira kochi kochi`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Anguispira kochi occidentalis` to `Anguispira kochi occidentalis`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Ardea herodias fannini` to `Ardea herodias fannini`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Balaenoptera acutorostrata acutorostrata` to `Balaenoptera acutorostrata acutorostrata`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Balaenoptera acutorostrata scammonii` to `Balaenoptera acutorostrata scammonii`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Bison bison athabascae` to `Bison bison athabascae`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Bison bison bison` to `Bison bison bison`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Bombus occidentalis mckayi` to `Bombus occidentalis mckayi`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Bombus occidentalis occidentalis` to `Bombus occidentalis occidentalis`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Calidris canutus islandica` to `Calidris canutus islandica`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Calidris canutus roselaari` to `Calidris canutus roselaari`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Calidris canutus rufa` to `Calidris canutus rufa`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Canis lupus arctos` to `Canis lupus arctos`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Canis lupus nubilus` to `Canis lupus nubilus`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Canis lupus occidentalis` to `Canis lupus occidentalis`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Centrocercus urophasianus phaios` to `Centrocercus urophasianus phaios`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Centrocercus urophasianus urophasianus` to `Centrocercus urophasianus urophasianus`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Charadrius melodus circumcinctus` to `Charadrius melodus circumcinctus`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Charadrius melodus melodus` to `Charadrius melodus melodus`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Chrysemys picta bellii` to `Chrysemys picta bellii`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Chrysemys picta bellii` to `Chrysemys picta bellii`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Chrysemys picta marginata` to `Chrysemys picta marginata`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Chrysemys picta picta` to `Chrysemys picta picta`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Cicindela formosa gibsoni` to `Cicindela formosa gibsoni`
[WARN] [2021-04-07 17:59:44] Filtered Scientific Name `Cicindela parowana wallisi` to `Cicindela parowana wallisi`
[WARN] [2021-04-07 17:59:44] (Reached filtered-name limit; suppressing further warnings.)
[INFO] [2021-04-07 17:59:44] Handling diff: /app/public/diff/COSEWIC_occurrences_3693.diff (873 lines)
[INFO] [2021-04-07 17:59:45] Loading occurrences diff file into memory (873 /app/public/diff/COSEWIC_occurrences_3693.diff lines)...
[INFO] [2021-04-07 17:59:46] Handling diff: /app/public/diff/COSEWIC_measurements_3693.diff (1393 lines)
[INFO] [2021-04-07 17:59:46] Loading measurements diff file into memory (1393 /app/public/diff/COSEWIC_measurements_3693.diff lines)...
[INFO] [2021-04-07 17:59:47] Storing 886 ScientificNames
[INFO] [2021-04-07 17:59:47] Processing group of 886 in 1 groups of 1000
[INFO] [2021-04-07 17:59:47] Average Time: 0.26
[INFO] [2021-04-07 17:59:47] Total Time: 1s
[INFO] [2021-04-07 17:59:47] Storing 886 Nodes
[INFO] [2021-04-07 17:59:47] Processing group of 886 in 1 groups of 1000
[INFO] [2021-04-07 17:59:48] Average Time: 0.24
[INFO] [2021-04-07 17:59:48] Total Time: 1s
[INFO] [2021-04-07 17:59:48] Storing 871 Occurrences
[INFO] [2021-04-07 17:59:48] Processing group of 871 in 1 groups of 1000
[INFO] [2021-04-07 17:59:48] Average Time: 0.1
[INFO] [2021-04-07 17:59:48] Total Time: 1s
[INFO] [2021-04-07 17:59:48] Storing 1391 Traits
[INFO] [2021-04-07 17:59:48] Processing group of 1391 in 2 groups of 1000
[INFO] [2021-04-07 17:59:48] Average Time: 0.2
[INFO] [2021-04-07 17:59:48] Total Time: 1s
[INFO] [2021-04-07 17:59:48] Storing 648 MetaTraits
[INFO] [2021-04-07 17:59:48] Processing group of 648 in 1 groups of 1000
[INFO] [2021-04-07 17:59:48] Average Time: 0.07
[INFO] [2021-04-07 17:59:48] Total Time: 1s
[STOP] [2021-04-07 17:59:48] parse_diff_and_store
[START] [2021-04-07 17:59:48] resolve_keys
[INFO] [2021-04-07 17:59:54] Occurrences to nodes (through scientific_names)...
[INFO] [2021-04-07 17:59:54] traits to occurrences...
[INFO] [2021-04-07 17:59:55] traits to nodes (through occurrences)...
[INFO] [2021-04-07 17:59:55] Traits to sex term...
[INFO] [2021-04-07 17:59:55] Traits to lifestage term...
[INFO] [2021-04-07 17:59:55] MetaTraits to traits...
[INFO] [2021-04-07 17:59:55] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2021-04-07 17:59:55] Assocs to occurrences...
[INFO] [2021-04-07 17:59:55] Assocs to nodes...
[INFO] [2021-04-07 17:59:55] Assoc to sex term...
[INFO] [2021-04-07 17:59:55] Assoc to lifestage term...
[INFO] [2021-04-07 17:59:55] MetaAssoc to assocs...
[STOP] [2021-04-07 17:59:55] resolve_keys
[START] [2021-04-07 17:59:55] hold_for_later_1
[STOP] [2021-04-07 17:59:55] hold_for_later_1
[START] [2021-04-07 17:59:55] hold_for_later_2
[STOP] [2021-04-07 17:59:55] hold_for_later_2
[START] [2021-04-07 17:59:55] resolve_missing_parents
[STOP] [2021-04-07 17:59:55] resolve_missing_parents
[START] [2021-04-07 17:59:55] rebuild_nodes
[START] [2021-04-07 17:59:55] Flattener#flatten
[START] [2021-04-07 17:59:55] Flattener#study_resource
[START] [2021-04-07 17:59:55] Flattener#build_ancestry
[STOP] [2021-04-07 17:59:55] Flattener#build_ancestry
[INFO] [2021-04-07 17:59:55] 886 ancestry keys
[START] [2021-04-07 17:59:55] build_node_ancestors
[INFO] [2021-04-07 17:59:55] old ancestors deleted.
[STOP] [2021-04-07 17:59:55] build_node_ancestors
[START] [2021-04-07 17:59:55] Flattener#propagate_ancestor_ids
[STOP] [2021-04-07 17:59:55] Flattener#propagate_ancestor_ids
[STOP] [2021-04-07 17:59:55] Flattener#flatten
[STOP] [2021-04-07 17:59:55] rebuild_nodes
[START] [2021-04-07 17:59:55] resolve_missing_media_owners
[STOP] [2021-04-07 17:59:55] resolve_missing_media_owners
[START] [2021-04-07 17:59:55] sanitize_media_verbatims
[STOP] [2021-04-07 17:59:55] sanitize_media_verbatims
[START] [2021-04-07 17:59:55] queue_downloads
[STOP] [2021-04-07 17:59:55] queue_downloads
[START] [2021-04-07 17:59:55] parse_names
[WARN] [2021-04-07 17:59:55] I see 886 names which still need to be parsed.
[WARN] [2021-04-07 17:59:57] I see 1 names which still need to be parsed.
[STOP] [2021-04-07 17:59:58] parse_names
[START] [2021-04-07 17:59:58] denormalize_canonical_names_to_nodes
[STOP] [2021-04-07 17:59:58] denormalize_canonical_names_to_nodes
[START] [2021-04-07 17:59:58] match_nodes
[START] [2021-04-07 17:59:58] map_all_nodes_to_pages
[STOP] [2021-04-07 18:00:21] map_all_nodes_to_pages
[INFO] [2021-04-07 18:00:21] 67 Unmatched nodes (of 886)! That's too many to output. Full list in /app/public/data/COSEWIC/unmatched_nodes.txt ; First 10: Haplodontium macrocarpum (#92667154); Asplenium scolopendrium americanum (#92666855); Carex nebrascensis (#92666930); Draba puvirnituqii (#92667047); Frasera caroliniensis (#92667115); Minuartia pusilla (#92667296); Viola praemorsa praemorsa (#92667639); Erioderma pedicellatum (#92667083); Peltigera gowardii (#92667376); Pseudevernia cladonia (#92667447)
[START] [2021-04-07 18:00:21] update_nodes
[STOP] [2021-04-07 18:00:22] update_nodes
[STOP] [2021-04-07 18:00:22] match_nodes
[START] [2021-04-07 18:00:22] reindex_search
[STOP] [2021-04-07 18:00:22] reindex_search
[START] [2021-04-07 18:00:22] normalize_units
[STOP] [2021-04-07 18:00:22] normalize_units
[START] [2021-04-07 18:00:22] calculate_statistics
[STOP] [2021-04-07 18:00:22] calculate_statistics
[START] [2021-04-07 18:00:22] complete_harvest_instance
[START] [2021-04-07 18:00:22] overall_tsv_creation
[INFO] [2021-04-07 18:00:22] Processing group of 886 in 1 batches of 10000
[INFO] [2021-04-07 18:01:02] 1086 Traits (unfiltered)...
[INFO] [2021-04-07 18:01:43] 1086 Traits (filtered)...
[INFO] [2021-04-07 18:01:43] 0 Associations (filtered)...
[INFO] [2021-04-07 18:01:43] 305 metadata added.
[INFO] [2021-04-07 18:01:43] 0 metadata added.
[INFO] [2021-04-07 18:02:10] Average Time: 82.74
[INFO] [2021-04-07 18:02:10] Total Time: 1m48s
[STOP] [2021-04-07 18:02:10] overall_tsv_creation
[INFO] [2021-04-07 18:02:10] Done. Check your files:
[INFO] [2021-04-07 18:02:10] (885 lines) /app/public/data/COSEWIC/publish_nodes.tsv
[INFO] [2021-04-07 18:02:11] (2199 lines) /app/public/data/COSEWIC/publish_node_ancestors.tsv
[INFO] [2021-04-07 18:02:11] (886 lines) /app/public/data/COSEWIC/publish_scientific_names.tsv
[INFO] [2021-04-07 18:02:12] (1087 lines) /app/public/data/COSEWIC/publish_traits.tsv
[INFO] [2021-04-07 18:02:13] (306 lines) /app/public/data/COSEWIC/publish_metadata.tsv
[STOP] [2021-04-07 18:02:13] complete_harvest_instance
[START] [2021-04-07 18:02:13] completed
[STOP] [2021-04-07 18:02:13] completed
[STOP] [2021-04-07 18:02:13] logged process, took 156.9
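
The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which makes it easier to see where a harvest spends its time. The following is a minimal sketch, not part of the harvester: it assumes every log line has the shape `[LEVEL] [YYYY-MM-DD HH:MM:SS] message` as seen above, and the names `parse_line` and `stage_durations` are hypothetical helpers introduced only for illustration. A STOP line is paired with the most recent unmatched START that has exactly the same message, so lines such as `logged process, took 156.9`, whose text differs from their START line, are skipped.

```python
import re
from datetime import datetime

# Assumed line shape, inferred from the log above:
#   [LEVEL] [YYYY-MM-DD HH:MM:SS] message
LINE_RE = re.compile(
    r"^\[(?P<level>[A-Z]+)\] "
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Return (level, timestamp, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
    return m.group("level"), ts, m.group("msg")


def stage_durations(lines):
    """Pair each STOP with the latest unmatched START of the same message.

    Returns a list of (stage_name, seconds) in the order the stages stopped.
    """
    open_starts = {}  # message -> list of start timestamps still open
    durations = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # e.g. the "# Logfile created on ..." header line
        level, ts, msg = parsed
        if level == "START":
            open_starts.setdefault(msg, []).append(ts)
        elif level == "STOP" and open_starts.get(msg):
            started = open_starts[msg].pop()
            durations.append((msg, (ts - started).total_seconds()))
    return durations


# Sample lines copied from the match_nodes stage of the 2021-04-07 run.
sample = [
    "[START] [2021-04-07 17:59:58] match_nodes",
    "[START] [2021-04-07 17:59:58] map_all_nodes_to_pages",
    "[STOP] [2021-04-07 18:00:21] map_all_nodes_to_pages",
    "[START] [2021-04-07 18:00:21] update_nodes",
    "[STOP] [2021-04-07 18:00:22] update_nodes",
    "[STOP] [2021-04-07 18:00:22] match_nodes",
]

for name, seconds in stage_durations(sample):
    print(f"{name}: {seconds:.0f}s")
```

For the six sample lines this reports 23s for map_all_nodes_to_pages, 1s for update_nodes, and 24s for the enclosing match_nodes stage. Timestamps in the log have one-second resolution, so stages that start and stop within the same second show as 0s.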