Harvest for Antweb
Created: 29 Dec 11:28
Stage: completed
Fetched: 29 Dec 11:28
Validated: 29 Dec 11:28
Deltas Created: 29 Dec 11:28
Units Normalized: 29 Dec 11:43
Ancestry Built: 29 Dec 11:37
Nodes Matched: 29 Dec 11:42
Names Parsed: 29 Dec 11:37
New Models Stored: 29 Dec 11:35
Indexed: 29 Dec 11:43
Completed: 29 Dec 11:58
Time to Harvest: 1 minute
Harvesting Log (309 lines)
[INFO] [2022-12-29 11:28:26] Created harvest instance #4245
[STOP] [2022-12-29 11:28:26] create_harvest_instance
[START] [2022-12-29 11:28:26] fetch_files
[STOP] [2022-12-29 11:28:26] fetch_files
[START] [2022-12-29 11:28:26] validate_each_file
[INFO] [2022-12-29 11:28:26] Created new folder: /app/public/converted_csv
[INFO] [2022-12-29 11:28:26] Looping over 5 formats...
[INFO] [2022-12-29 11:28:26] ...agents (/app/public/data/antweb/agent.tab)
[INFO] [2022-12-29 11:28:26] Valid: /app/public/data/antweb/converted_csv/antweb_agents_29981.csv (173 lines)
[INFO] [2022-12-29 11:28:26] ...nodes (/app/public/data/antweb/taxon.tab)
[INFO] [2022-12-29 11:28:26] Valid: /app/public/data/antweb/converted_csv/antweb_nodes_29979.csv (16470 lines)
[INFO] [2022-12-29 11:28:26] ...media (/app/public/data/antweb/media_resource.tab)
[INFO] [2022-12-29 11:28:37] Valid: /app/public/data/antweb/converted_csv/antweb_media_29980.csv (140143 lines)
[INFO] [2022-12-29 11:28:37] ...occurrences (/app/public/data/antweb/occurrence_specific.tab)
[INFO] [2022-12-29 11:28:37] Valid: /app/public/data/antweb/converted_csv/antweb_occurrences_29982.csv (58339 lines)
[INFO] [2022-12-29 11:28:37] ...measurements (/app/public/data/antweb/measurement_or_fact_specific.tab)
[INFO] [2022-12-29 11:28:43] Valid: /app/public/data/antweb/converted_csv/antweb_measurements_29983.csv (89324 lines)
[STOP] [2022-12-29 11:28:43] validate_each_file
[START] [2022-12-29 11:28:43] convert_to_csv
[INFO] [2022-12-29 11:28:43] Looping over 5 formats...
[INFO] [2022-12-29 11:28:43] ...agents (/app/public/data/antweb/agent.tab)
[CMD] [2022-12-29 11:28:43] /usr/bin/sort /app/public/data/antweb/converted_csv/antweb_agents_29981.csv > /app/public/data/antweb/converted_csv/antweb_agents_29981.csv_sorted
[INFO] [2022-12-29 11:28:43] Converted: /app/public/data/antweb/converted_csv/antweb_agents_29981.csv (173 lines)
[INFO] [2022-12-29 11:28:43] ...nodes (/app/public/data/antweb/taxon.tab)
[CMD] [2022-12-29 11:28:43] /usr/bin/sort /app/public/data/antweb/converted_csv/antweb_nodes_29979.csv > /app/public/data/antweb/converted_csv/antweb_nodes_29979.csv_sorted
[INFO] [2022-12-29 11:28:43] Converted: /app/public/data/antweb/converted_csv/antweb_nodes_29979.csv (16470 lines)
[INFO] [2022-12-29 11:28:43] ...media (/app/public/data/antweb/media_resource.tab)
[CMD] [2022-12-29 11:28:43] /usr/bin/sort /app/public/data/antweb/converted_csv/antweb_media_29980.csv > /app/public/data/antweb/converted_csv/antweb_media_29980.csv_sorted
[INFO] [2022-12-29 11:28:46] Converted: /app/public/data/antweb/converted_csv/antweb_media_29980.csv (140143 lines)
[INFO] [2022-12-29 11:28:46] ...occurrences (/app/public/data/antweb/occurrence_specific.tab)
[CMD] [2022-12-29 11:28:46] /usr/bin/sort /app/public/data/antweb/converted_csv/antweb_occurrences_29982.csv > /app/public/data/antweb/converted_csv/antweb_occurrences_29982.csv_sorted
[INFO] [2022-12-29 11:28:46] Converted: /app/public/data/antweb/converted_csv/antweb_occurrences_29982.csv (58339 lines)
[INFO] [2022-12-29 11:28:46] ...measurements (/app/public/data/antweb/measurement_or_fact_specific.tab)
[CMD] [2022-12-29 11:28:46] /usr/bin/sort /app/public/data/antweb/converted_csv/antweb_measurements_29983.csv > /app/public/data/antweb/converted_csv/antweb_measurements_29983.csv_sorted
[INFO] [2022-12-29 11:28:47] Converted: /app/public/data/antweb/converted_csv/antweb_measurements_29983.csv (89324 lines)
[STOP] [2022-12-29 11:28:47] convert_to_csv
[START] [2022-12-29 11:28:47] calculate_delta
[INFO] [2022-12-29 11:28:47] Created diff dir: /app/public/diff
[INFO] [2022-12-29 11:28:47] Looping over 5 formats...
[INFO] [2022-12-29 11:28:47] ...agents (/app/public/data/antweb/agent.tab)
[CMD] [2022-12-29 11:28:47] echo "0a" > /app/public/data/antweb/diff/antweb_agents_29981.diff
[CMD] [2022-12-29 11:28:48] tail -n +1 /app/public/data/antweb/converted_csv/antweb_agents_29981.csv >> /app/public/data/antweb/diff/antweb_agents_29981.diff
[CMD] [2022-12-29 11:28:48] echo "." >> /app/public/data/antweb/diff/antweb_agents_29981.diff
[INFO] [2022-12-29 11:28:48] Created diff: /app/public/data/antweb/diff/antweb_agents_29981.diff (175 lines)
[INFO] [2022-12-29 11:28:48] ...nodes (/app/public/data/antweb/taxon.tab)
[CMD] [2022-12-29 11:28:48] echo "0a" > /app/public/data/antweb/diff/antweb_nodes_29979.diff
[CMD] [2022-12-29 11:28:48] tail -n +1 /app/public/data/antweb/converted_csv/antweb_nodes_29979.csv >> /app/public/data/antweb/diff/antweb_nodes_29979.diff
[CMD] [2022-12-29 11:28:48] echo "." >> /app/public/data/antweb/diff/antweb_nodes_29979.diff
[INFO] [2022-12-29 11:28:48] Created diff: /app/public/data/antweb/diff/antweb_nodes_29979.diff (16472 lines)
[INFO] [2022-12-29 11:28:48] ...media (/app/public/data/antweb/media_resource.tab)
[CMD] [2022-12-29 11:28:48] echo "0a" > /app/public/data/antweb/diff/antweb_media_29980.diff
[CMD] [2022-12-29 11:28:48] tail -n +1 /app/public/data/antweb/converted_csv/antweb_media_29980.csv >> /app/public/data/antweb/diff/antweb_media_29980.diff
[CMD] [2022-12-29 11:28:49] echo "." >> /app/public/data/antweb/diff/antweb_media_29980.diff
[INFO] [2022-12-29 11:28:50] Created diff: /app/public/data/antweb/diff/antweb_media_29980.diff (140145 lines)
[INFO] [2022-12-29 11:28:50] ...occurrences (/app/public/data/antweb/occurrence_specific.tab)
[CMD] [2022-12-29 11:28:50] echo "0a" > /app/public/data/antweb/diff/antweb_occurrences_29982.diff
[CMD] [2022-12-29 11:28:50] tail -n +1 /app/public/data/antweb/converted_csv/antweb_occurrences_29982.csv >> /app/public/data/antweb/diff/antweb_occurrences_29982.diff
[CMD] [2022-12-29 11:28:50] echo "." >> /app/public/data/antweb/diff/antweb_occurrences_29982.diff
[INFO] [2022-12-29 11:28:50] Created diff: /app/public/data/antweb/diff/antweb_occurrences_29982.diff (58341 lines)
[INFO] [2022-12-29 11:28:50] ...measurements (/app/public/data/antweb/measurement_or_fact_specific.tab)
[CMD] [2022-12-29 11:28:50] echo "0a" > /app/public/data/antweb/diff/antweb_measurements_29983.diff
[CMD] [2022-12-29 11:28:50] tail -n +1 /app/public/data/antweb/converted_csv/antweb_measurements_29983.csv >> /app/public/data/antweb/diff/antweb_measurements_29983.diff
[CMD] [2022-12-29 11:28:50] echo "." >> /app/public/data/antweb/diff/antweb_measurements_29983.diff
[INFO] [2022-12-29 11:28:50] Created diff: /app/public/data/antweb/diff/antweb_measurements_29983.diff (89326 lines)
[STOP] [2022-12-29 11:28:50] calculate_delta
[START] [2022-12-29 11:28:50] parse_diff_and_store
[INFO] [2022-12-29 11:28:50] Handling diff: /app/public/data/antweb/diff/antweb_agents_29981.diff (175 lines)
[INFO] [2022-12-29 11:28:50] Loading agents diff file into memory (175 lines)...
[INFO] [2022-12-29 11:28:50] Storing 173 Attributions (173/173/175)
[INFO] [2022-12-29 11:28:50] Handling diff: /app/public/data/antweb/diff/antweb_nodes_29979.diff (16472 lines)
[INFO] [2022-12-29 11:28:50] Loading nodes diff file into memory (16472 lines)...
[INFO] [2022-12-29 11:28:53] Storing 10003 ScientificNames (20006/10000/16472)
[INFO] [2022-12-29 11:28:56] Storing 10003 Nodes (20006/10000/16472)
[WARN] [2022-12-29 11:29:01] SKIPPED 4 Scientific names (32956/16470/16472) with resource_pks already be in the database!
[WARN] [2022-12-29 11:29:01] SKIPPED 4 Nodes (32956/16470/16472) with resource_pks already be in the database!
[INFO] [2022-12-29 11:29:01] Storing 6471 ScientificNames (32956/16470/16472)
[INFO] [2022-12-29 11:29:03] Storing 6471 Nodes (32956/16470/16472)
[INFO] [2022-12-29 11:29:04] Handling diff: /app/public/data/antweb/diff/antweb_media_29980.diff (140145 lines)
[INFO] [2022-12-29 11:29:04] Loading media diff file into memory (140145 lines)...
[INFO] [2022-12-29 11:29:20] Storing 9999 BibliographicCitations (29997/10000/140145)
[INFO] [2022-12-29 11:29:21] Storing 6034 ContentAttributions (29997/10000/140145)
[INFO] [2022-12-29 11:29:22] Storing 6034 Media (29997/10000/140145)
[INFO] [2022-12-29 11:29:24] Storing 3965 ArticlesSections (29997/10000/140145)
[INFO] [2022-12-29 11:29:24] Storing 3965 Articles (29997/10000/140145)
[INFO] [2022-12-29 11:29:36] Storing 10000 BibliographicCitations (59997/20000/140145)
[INFO] [2022-12-29 11:29:37] Storing 10000 ContentAttributions (59997/20000/140145)
[INFO] [2022-12-29 11:29:38] Storing 10000 Media (59997/20000/140145)
[INFO] [2022-12-29 11:29:51] Storing 10000 BibliographicCitations (89997/30000/140145)
[INFO] [2022-12-29 11:29:53] Storing 10000 ContentAttributions (89997/30000/140145)
[INFO] [2022-12-29 11:29:54] Storing 10000 Media (89997/30000/140145)
[INFO] [2022-12-29 11:30:07] Storing 10000 BibliographicCitations (119997/40000/140145)
[INFO] [2022-12-29 11:30:09] Storing 10000 ContentAttributions (119997/40000/140145)
[INFO] [2022-12-29 11:30:10] Storing 10000 Media (119997/40000/140145)
[INFO] [2022-12-29 11:30:23] Storing 10000 BibliographicCitations (149997/50000/140145)
[INFO] [2022-12-29 11:30:26] Storing 10000 ContentAttributions (149997/50000/140145)
[INFO] [2022-12-29 11:30:27] Storing 10000 Media (149997/50000/140145)
[INFO] [2022-12-29 11:30:41] Storing 10000 BibliographicCitations (179997/60000/140145)
[INFO] [2022-12-29 11:30:42] Storing 10000 ContentAttributions (179997/60000/140145)
[INFO] [2022-12-29 11:30:43] Storing 10000 Media (179997/60000/140145)
[INFO] [2022-12-29 11:30:57] Storing 10000 BibliographicCitations (209997/70000/140145)
[INFO] [2022-12-29 11:30:59] Storing 10000 ContentAttributions (209997/70000/140145)
[INFO] [2022-12-29 11:30:59] Storing 10000 Media (209997/70000/140145)
[INFO] [2022-12-29 11:31:13] Storing 10000 BibliographicCitations (239997/80000/140145)
[INFO] [2022-12-29 11:31:15] Storing 10000 ContentAttributions (239997/80000/140145)
[INFO] [2022-12-29 11:31:16] Storing 10000 Media (239997/80000/140145)
[INFO] [2022-12-29 11:31:30] Storing 10000 BibliographicCitations (269997/90000/140145)
[INFO] [2022-12-29 11:31:31] Storing 10000 ContentAttributions (269997/90000/140145)
[INFO] [2022-12-29 11:31:32] Storing 10000 Media (269997/90000/140145)
[INFO] [2022-12-29 11:31:46] Storing 10000 BibliographicCitations (299997/100000/140145)
[INFO] [2022-12-29 11:31:48] Storing 10000 ContentAttributions (299997/100000/140145)
[INFO] [2022-12-29 11:31:48] Storing 10000 Media (299997/100000/140145)
[INFO] [2022-12-29 11:32:06] Storing 10000 BibliographicCitations (329997/110000/140145)
[INFO] [2022-12-29 11:32:08] Storing 7395 ContentAttributions (329997/110000/140145)
[INFO] [2022-12-29 11:32:09] Storing 7395 Media (329997/110000/140145)
[INFO] [2022-12-29 11:32:11] Storing 2605 ArticlesSections (329997/110000/140145)
[INFO] [2022-12-29 11:32:11] Storing 2605 Articles (329997/110000/140145)
[INFO] [2022-12-29 11:32:23] Storing 10000 BibliographicCitations (359997/120000/140145)
[INFO] [2022-12-29 11:32:25] Storing 9623 ContentAttributions (359997/120000/140145)
[INFO] [2022-12-29 11:32:25] Storing 9623 Media (359997/120000/140145)
[INFO] [2022-12-29 11:32:29] Storing 377 ArticlesSections (359997/120000/140145)
[INFO] [2022-12-29 11:32:29] Storing 377 Articles (359997/120000/140145)
[INFO] [2022-12-29 11:32:48] Storing 10000 BibliographicCitations (389997/130000/140145)
[INFO] [2022-12-29 11:32:50] Storing 3866 ContentAttributions (389997/130000/140145)
[INFO] [2022-12-29 11:32:50] Storing 3866 Media (389997/130000/140145)
[INFO] [2022-12-29 11:32:52] Storing 6134 ArticlesSections (389997/130000/140145)
[INFO] [2022-12-29 11:32:52] Storing 6134 Articles (389997/130000/140145)
[INFO] [2022-12-29 11:33:13] Storing 10000 BibliographicCitations (419997/140000/140145)
[INFO] [2022-12-29 11:33:15] Storing 7485 ArticlesSections (419997/140000/140145)
[INFO] [2022-12-29 11:33:15] Storing 7485 Articles (419997/140000/140145)
[INFO] [2022-12-29 11:33:18] Storing 2515 ContentAttributions (419997/140000/140145)
[INFO] [2022-12-29 11:33:18] Storing 2515 Media (419997/140000/140145)
[INFO] [2022-12-29 11:33:21] Storing 144 BibliographicCitations (420429/140143/140145)
[INFO] [2022-12-29 11:33:21] Storing 67 ArticlesSections (420429/140143/140145)
[INFO] [2022-12-29 11:33:21] Storing 67 Articles (420429/140143/140145)
[INFO] [2022-12-29 11:33:21] Storing 77 ContentAttributions (420429/140143/140145)
[INFO] [2022-12-29 11:33:21] Storing 77 Media (420429/140143/140145)
[INFO] [2022-12-29 11:33:21] Handling diff: /app/public/data/antweb/diff/antweb_occurrences_29982.diff (58341 lines)
[INFO] [2022-12-29 11:33:21] Loading occurrences diff file into memory (58341 lines)...
[INFO] [2022-12-29 11:33:22] Storing 9999 Occurrences (9999/10000/58341)
[INFO] [2022-12-29 11:33:24] Storing 10000 Occurrences (19999/20000/58341)
[INFO] [2022-12-29 11:33:26] Storing 10000 Occurrences (29999/30000/58341)
[INFO] [2022-12-29 11:33:28] Storing 10000 Occurrences (39999/40000/58341)
[INFO] [2022-12-29 11:33:30] Storing 10000 Occurrences (49999/50000/58341)
[INFO] [2022-12-29 11:33:33] Storing 8340 Occurrences (58339/58339/58341)
[INFO] [2022-12-29 11:33:34] Handling diff: /app/public/data/antweb/diff/antweb_measurements_29983.diff (89326 lines)
[INFO] [2022-12-29 11:33:34] Loading measurements diff file into memory (89326 lines)...
[INFO] [2022-12-29 11:33:40] Storing 9999 Traits (29887/10000/89326)
[INFO] [2022-12-29 11:33:46] Storing 19888 MetaTraits (29887/10000/89326)
[INFO] [2022-12-29 11:33:54] Storing 10000 Traits (59778/20000/89326)
[INFO] [2022-12-29 11:33:59] Storing 19891 MetaTraits (59778/20000/89326)
[INFO] [2022-12-29 11:34:06] Storing 10000 Traits (89670/30000/89326)
[INFO] [2022-12-29 11:34:11] Storing 19892 MetaTraits (89670/30000/89326)
[INFO] [2022-12-29 11:34:18] Storing 10000 Traits (119547/40000/89326)
[INFO] [2022-12-29 11:34:23] Storing 19877 MetaTraits (119547/40000/89326)
[INFO] [2022-12-29 11:34:30] Storing 10000 Traits (149448/50000/89326)
[INFO] [2022-12-29 11:34:35] Storing 19901 MetaTraits (149448/50000/89326)
[INFO] [2022-12-29 11:34:43] Storing 10000 Traits (179341/60000/89326)
[INFO] [2022-12-29 11:34:47] Storing 19893 MetaTraits (179341/60000/89326)
[INFO] [2022-12-29 11:34:55] Storing 10000 Traits (209243/70000/89326)
[INFO] [2022-12-29 11:34:58] Storing 19902 MetaTraits (209243/70000/89326)
[INFO] [2022-12-29 11:35:06] Storing 10000 Traits (239132/80000/89326)
[INFO] [2022-12-29 11:35:10] Storing 19889 MetaTraits (239132/80000/89326)
[INFO] [2022-12-29 11:35:17] Storing 9325 Traits (267020/89324/89326)
[INFO] [2022-12-29 11:35:20] Storing 18563 MetaTraits (267020/89324/89326)
[STOP] [2022-12-29 11:35:22] parse_diff_and_store
[START] [2022-12-29 11:35:22] resolve_keys
[2022-12-29 11:35:38] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2022-12-29 11:36:51] Occurrences to nodes (through scientific_names)...
[INFO] [2022-12-29 11:36:52] traits to occurrences...
[INFO] [2022-12-29 11:36:57] traits to nodes (through occurrences)...
[INFO] [2022-12-29 11:37:00] Traits to sex term...
[INFO] [2022-12-29 11:37:01] Traits to lifestage term...
[INFO] [2022-12-29 11:37:03] MetaTraits to traits...
[INFO] [2022-12-29 11:37:07] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2022-12-29 11:37:07] Assocs to occurrences...
[INFO] [2022-12-29 11:37:07] Assocs to nodes...
[INFO] [2022-12-29 11:37:07] Assoc to sex term...
[INFO] [2022-12-29 11:37:07] Assoc to lifestage term...
[INFO] [2022-12-29 11:37:07] MetaAssoc to assocs...
[STOP] [2022-12-29 11:37:11] resolve_keys
[START] [2022-12-29 11:37:11] hold_for_later_1
[STOP] [2022-12-29 11:37:11] hold_for_later_1
[START] [2022-12-29 11:37:11] hold_for_later_2
[STOP] [2022-12-29 11:37:11] hold_for_later_2
[START] [2022-12-29 11:37:11] resolve_missing_parents
[STOP] [2022-12-29 11:37:11] resolve_missing_parents
[START] [2022-12-29 11:37:11] rebuild_nodes
[START] [2022-12-29 11:37:11] Flattener#flatten
[START] [2022-12-29 11:37:11] Flattener#study_resource
[START] [2022-12-29 11:37:12] Flattener#build_ancestry
[STOP] [2022-12-29 11:37:12] Flattener#build_ancestry
[INFO] [2022-12-29 11:37:12] 16474 ancestry keys
[START] [2022-12-29 11:37:12] build_node_ancestors
[INFO] [2022-12-29 11:37:12] old ancestors deleted.
[STOP] [2022-12-29 11:37:15] build_node_ancestors
[START] [2022-12-29 11:37:19] Flattener#propagate_ancestor_ids
[STOP] [2022-12-29 11:37:20] Flattener#propagate_ancestor_ids
[STOP] [2022-12-29 11:37:20] Flattener#flatten
[STOP] [2022-12-29 11:37:20] rebuild_nodes
[START] [2022-12-29 11:37:20] resolve_missing_media_owners
[STOP] [2022-12-29 11:37:20] resolve_missing_media_owners
[START] [2022-12-29 11:37:20] sanitize_media_verbatims
[STOP] [2022-12-29 11:37:20] sanitize_media_verbatims
[START] [2022-12-29 11:37:20] queue_downloads
[STOP] [2022-12-29 11:37:20] queue_downloads
[START] [2022-12-29 11:37:20] parse_names
[WARN] [2022-12-29 11:37:20] I see 16474 names which still need to be parsed.
[INFO] [2022-12-29 11:37:21] 0% of media downloaded
[WARN] [2022-12-29 11:37:21] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[INFO] [2022-12-29 11:37:21] 0% of media downloaded
[INFO] [2022-12-29 11:37:21] 0% of media downloaded
[INFO] [2022-12-29 11:37:23] 10% of media downloaded
[WARN] [2022-12-29 11:37:29] Names to parse: 6474 formatted: 6474 learned: 6474 parsed: 6474
[STOP] [2022-12-29 11:37:34] parse_names
[START] [2022-12-29 11:37:34] denormalize_canonical_names_to_nodes
[STOP] [2022-12-29 11:37:35] denormalize_canonical_names_to_nodes
[START] [2022-12-29 11:37:35] match_nodes
[START] [2022-12-29 11:37:35] map_all_nodes_to_pages
[STOP] [2022-12-29 11:42:38] map_all_nodes_to_pages
[INFO] [2022-12-29 11:42:40] 3836 Unmatched nodes (of 16474)! That's too many to output. Full list in /app/public/data/antweb/unmatched_nodes.txt ; First 10: Canonical: Acanthognathus laevigatus; Node#122377090; ResourceID: acanthognathus_laevigatus; Canonical: Acanthomyrmex glabfemoralis; Node#122377104; ResourceID: acanthomyrmex_glabfemoralis; Canonical: Acanthomyrmex humilis; Node#122377105; ResourceID: acanthomyrmex_humilis; Canonical: Acanthomyrmex malikuli; Node#122377108; ResourceID: acanthomyrmex_malikuli; Canonical: Acanthomyrmex minus; Node#122377110; ResourceID: acanthomyrmex_minus; Canonical: Acanthomyrmex mizunoi; Node#122377111; ResourceID: acanthomyrmex_mizunoi; Canonical: Acanthomyrmex padanensis; Node#122377113; ResourceID: acanthomyrmex_padanensis; Canonical: Acanthomyrmex sulawesiensis; Node#122377114; ResourceID: acanthomyrmex_sulawesiensis; Canonical: Acanthomyrmex thailandensis; Node#122377115; ResourceID: acanthomyrmex_thailandensis; Canonical: Acanthostichus arizonensis; Node#122377120; ResourceID: acanthostichus_arizonensis
[START] [2022-12-29 11:42:40] update_nodes
[STOP] [2022-12-29 11:42:51] update_nodes
[STOP] [2022-12-29 11:42:51] match_nodes
[START] [2022-12-29 11:42:51] reindex_search
[STOP] [2022-12-29 11:43:14] reindex_search
[START] [2022-12-29 11:43:14] normalize_units
[STOP] [2022-12-29 11:43:14] normalize_units
[START] [2022-12-29 11:43:14] calculate_statistics
[INFO] [2022-12-29 11:43:18] Duplicate page_id count: 0
[STOP] [2022-12-29 11:43:18] calculate_statistics
[START] [2022-12-29 11:43:18] complete_harvest_instance
[START] [2022-12-29 11:43:18] overall_tsv_creation
[INFO] [2022-12-29 11:43:18] Exporting 16474 nodes as TSV in batches of 10000...
[INFO] [2022-12-29 11:43:18] Processing group of 16474 in 2 batches of 10000
[ERR] [2022-12-29 11:43:58][hdls] download_and_prep FAILED for Medium.find(18669027): 308 Permanent Redirect
[ERR] [2022-12-29 11:43:59][hdls] download_and_prep FAILED for Medium.find(18669033): 308 Permanent Redirect
[ERR] [2022-12-29 11:44:00][hdls] download_and_prep FAILED for Medium.find(18669037): 308 Permanent Redirect
[ERR] [2022-12-29 11:45:34][hdls] download_and_prep FAILED for Medium.find(18686052): 308 Permanent Redirect
[ERR] [2022-12-29 11:45:37][hdls] download_and_prep FAILED for Medium.find(18686086): 308 Permanent Redirect
[ERR] [2022-12-29 11:45:39][hdls] download_and_prep FAILED for Medium.find(18686101): 308 Permanent Redirect
[INFO] [2022-12-29 11:49:49] 54284 Traits (unfiltered) and 0 associations...
[INFO] [2022-12-29 11:49:49] Building Traits map for 10000 nodes (this can take a while)...
[INFO] [2022-12-29 11:50:43] 70% of media downloaded
[INFO] [2022-12-29 11:51:00] 80% of media downloaded
[INFO] [2022-12-29 11:51:15] Mapped 54284 traits (108077 meta) for 10000 nodes.
[INFO] [2022-12-29 11:51:15] Building Associations map (this can take a while)...
[INFO] [2022-12-29 11:51:26] Done. 0 assocs mapped (0 meta).
[INFO] [2022-12-29 11:51:26] Adding 54284 traits...
[INFO] [2022-12-29 11:51:43] 0 metadata added.
[INFO] [2022-12-29 11:51:43] Adding 0 assocs...
[INFO] [2022-12-29 11:51:43] 0 metadata added.
[INFO] [2022-12-29 11:52:50] Processed 10000/16474 nodes
[INFO] [2022-12-29 11:55:05] 90% of media downloaded
[INFO] [2022-12-29 11:55:30] 100% of media downloaded
[ERR] [2022-12-29 11:55:30][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:55:30] 100% of media downloaded
[ERR] [2022-12-29 11:55:30][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[ERR] [2022-12-29 11:55:31][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:55:32] 100% of media downloaded
[ERR] [2022-12-29 11:55:32][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:55:34] 100% of media downloaded
[ERR] [2022-12-29 11:55:35][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:55:35] 100% of media downloaded
[ERR] [2022-12-29 11:55:36][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:56:41] 35040 Traits (unfiltered) and 0 associations...
[INFO] [2022-12-29 11:56:41] Building Traits map for 6474 nodes (this can take a while)...
[INFO] [2022-12-29 11:56:42] 100% of media downloaded
[ERR] [2022-12-29 11:56:42][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:57:20] Mapped 35040 traits (69619 meta) for 6474 nodes.
[INFO] [2022-12-29 11:57:20] Building Associations map (this can take a while)...
[INFO] [2022-12-29 11:57:29] Done. 0 assocs mapped (0 meta).
[INFO] [2022-12-29 11:57:29] Adding 35040 traits...
[INFO] [2022-12-29 11:57:40] 0 metadata added.
[INFO] [2022-12-29 11:57:40] Adding 0 assocs...
[INFO] [2022-12-29 11:57:40] 0 metadata added.
[INFO] [2022-12-29 11:58:10] 100% of media downloaded
[ERR] [2022-12-29 11:58:11][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:58:15] 100% of media downloaded
[ERR] [2022-12-29 11:58:16][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[ERR] [2022-12-29 11:58:17][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:58:22] 100% of media downloaded
[ERR] [2022-12-29 11:58:22][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:58:28] 100% of media downloaded
[ERR] [2022-12-29 11:58:28][hdls] NO additional images were found to download, NOTE THAT 6 DOWNLOADS FAILED.
[INFO] [2022-12-29 11:58:37] Processed 16474/16474 nodes
[INFO] [2022-12-29 11:58:37] Average Time: 407.87
[INFO] [2022-12-29 11:58:37] Total Time: 15m19s
[STOP] [2022-12-29 11:58:37] overall_tsv_creation
[INFO] [2022-12-29 11:58:37] Done. Check your files:
[INFO] [2022-12-29 11:58:37] (140143 lines) /app/public/data/antweb/publish_bibliographic_citations.tsv
[INFO] [2022-12-29 11:58:38] (16474 lines) /app/public/data/antweb/publish_nodes.tsv
[INFO] [2022-12-29 11:58:38] (65886 lines) /app/public/data/antweb/publish_node_ancestors.tsv
[INFO] [2022-12-29 11:58:38] (16474 lines) /app/public/data/antweb/publish_scientific_names.tsv
[INFO] [2022-12-29 11:58:38] (119510 lines) /app/public/data/antweb/publish_media.tsv
[INFO] [2022-12-29 11:58:39] (20634 lines) /app/public/data/antweb/publish_articles.tsv
[INFO] [2022-12-29 11:58:39] (55635 lines) /app/public/data/antweb/publish_image_info.tsv
[INFO] [2022-12-29 11:58:39] (119510 lines) /app/public/data/antweb/publish_attributions.tsv
[INFO] [2022-12-29 11:58:39] (20633 lines) /app/public/data/antweb/publish_content_sections.tsv
[INFO] [2022-12-29 11:58:39] (89325 lines) /app/public/data/antweb/publish_traits.tsv
[INFO] [2022-12-29 11:58:40] (1 lines) /app/public/data/antweb/publish_metadata.tsv
[STOP] [2022-12-29 11:58:40] complete_harvest_instance
[START] [2022-12-29 11:58:40] completed
[STOP] [2022-12-29 11:58:40] completed
[STOP] [2022-12-29 11:58:40] logged process, took 1815.47
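
The START/STOP pairs in the log above can be used to see where the harvest spent its time. The following is a minimal sketch, not part of the harvester itself: the function names `parse_line` and `stage_durations` are hypothetical, and the line pattern is inferred only from the entries shown here (an optional level tag, a timestamp, an optional `[hdls]`-style tag, then the message).

```python
import re
from datetime import datetime

# Pattern inferred from the log lines above, e.g.
#   [STOP] [2022-12-29 11:28:43] validate_each_file
#   [ERR] [2022-12-29 11:43:58][hdls] download_and_prep FAILED ...
# The level tag is optional because one line in the log has none.
LINE_RE = re.compile(
    r"^(?:\[(?P<level>[A-Z]+)\]\s*)?"
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]"
    r"(?:\[(?P<tag>\w+)\])?\s*(?P<msg>.*)$"
)


def parse_line(line):
    """Split one log line into level, time, tag and message (None if no match)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "level": m.group("level"),
        "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "tag": m.group("tag"),
        "message": m.group("msg"),
    }


def stage_durations(lines):
    """Pair each START with the next STOP of the same name; return seconds per stage."""
    open_stages = {}
    durations = {}
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["level"] not in ("START", "STOP"):
            continue
        name = entry["message"]
        if entry["level"] == "START":
            open_stages[name] = entry["time"]
        elif name in open_stages:  # a STOP without a recorded START is ignored
            delta = entry["time"] - open_stages.pop(name)
            durations[name] = durations.get(name, 0.0) + delta.total_seconds()
    return durations


# A few lines copied from the log above.
sample = [
    "[START] [2022-12-29 11:28:26] validate_each_file",
    "[INFO] [2022-12-29 11:28:26] Looping over 5 formats...",
    "[STOP] [2022-12-29 11:28:43] validate_each_file",
    "[START] [2022-12-29 11:37:35] map_all_nodes_to_pages",
    "[STOP] [2022-12-29 11:42:38] map_all_nodes_to_pages",
]
durations = stage_durations(sample)
for name, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {seconds:.0f}s")
```

Run against the full log, a summary like this would make it easier to compare the per-stage times with the overall figure reported by the final "logged process" line.
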
[ERR] [2023-01-07 11:48:52][hdls] download_and_prep FAILED for Medium.find(18687959): 308 Permanent Redirect
[ERR] [2023-01-07 11:48:52][hdls] download_and_prep FAILED for Medium.find(18687960): 308 Permanent Redirect
[ERR] [2023-01-07 11:48:52][hdls] download_and_prep FAILED for Medium.find(18687961): 308 Permanent Redirect
[INFO] [2023-01-07 11:53:44] 100% of media downloaded
[ERR] [2023-01-07 11:53:45][hdls] NO additional images were found to download, NOTE THAT 9 DOWNLOADS FAILED.
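
The calculate_delta stage in the log builds each `.diff` file from three shell commands: `echo "0a"`, `tail -n +1` on the converted CSV, and `echo "."`. That layout matches an ed-style append script ("insert these lines after line 0"), which would explain why every diff is exactly two lines longer than its CSV (173 to 175, 16470 to 16472, and so on). The sketch below reproduces that construction in Python as an illustration only; `write_full_add_diff` is a hypothetical name, and how the harvester consumes the file afterwards is not shown in the log.

```python
import os
import tempfile


def write_full_add_diff(csv_path, diff_path):
    """Write an ed-style script that appends every line of csv_path after line 0.

    Returns the number of lines written to diff_path.
    """
    with open(csv_path, "r", encoding="utf-8") as src, \
         open(diff_path, "w", encoding="utf-8") as out:
        out.write("0a\n")              # same as: echo "0a" > file.diff
        line_count = 0
        for line in src:               # same as: tail -n +1 file.csv >> file.diff
            out.write(line if line.endswith("\n") else line + "\n")
            line_count += 1
        out.write(".\n")               # same as: echo "." >> file.diff
    return line_count + 2


# Small self-contained demonstration with made-up rows.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "example.csv")
    diff_path = os.path.join(tmp, "example.diff")
    with open(csv_path, "w", encoding="utf-8") as f:
        f.write("a,1\nb,2\nc,3\n")
    total = write_full_add_diff(csv_path, diff_path)
    with open(diff_path, "r", encoding="utf-8") as f:
        diff_lines = f.read().splitlines()

print(total, diff_lines[0], diff_lines[-1])
```

One caveat of this format: in an ed script a data line consisting of a single `.` ends the appended text, so a consumer that follows ed semantics strictly would need such rows escaped or handled separately.
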