Harvest for wikipedia EN
Created: 01 Jun 10:10

Stage: completed
Fetched: 01 Jun 10:10
Validated: 01 Jun 10:12
Deltas Created: 01 Jun 10:13
Units Normalized: 01 Jun 14:11
Ancestry Built: 01 Jun 11:59
Nodes Matched: 01 Jun 13:55
Names Parsed: 01 Jun 12:05
New Models Stored: 01 Jun 11:20
Indexed: 01 Jun 14:11
Completed: 01 Jun 15:34
Time to Harvest: 5 minutes

Harvesting Log

(464 lines)
[INFO] [2022-06-01 10:10:27] Created harvest instance #4129
[STOP] [2022-06-01 10:10:27] create_harvest_instance
[START] [2022-06-01 10:10:27] fetch_files
[STOP] [2022-06-01 10:10:27] fetch_files
[START] [2022-06-01 10:10:27] validate_each_file
[INFO] [2022-06-01 10:10:27] Looping over 2 formats...
[INFO] [2022-06-01 10:10:27] ...nodes (/app/public/data/wiki_english/taxon.tab)
[INFO] [2022-06-01 10:10:42] Valid: /app/public/data/wiki_english/converted_csv/wiki_english_nodes_29350.csv (438686 lines)
[INFO] [2022-06-01 10:10:42] ...media (/app/public/data/wiki_english/media_resource.tab)
[INFO] [2022-06-01 10:12:20] Valid: /app/public/data/wiki_english/converted_csv/wiki_english_media_29349.csv (832790 lines)
[STOP] [2022-06-01 10:12:20] validate_each_file
[START] [2022-06-01 10:12:20] convert_to_csv
[INFO] [2022-06-01 10:12:20] Looping over 2 formats...
[INFO] [2022-06-01 10:12:20] ...nodes (/app/public/data/wiki_english/taxon.tab)
[CMD] [2022-06-01 10:12:20] /usr/bin/sort /app/public/data/wiki_english/converted_csv/wiki_english_nodes_29350.csv > /app/public/data/wiki_english/converted_csv/wiki_english_nodes_29350.csv_sorted
[INFO] [2022-06-01 10:12:22] Converted: /app/public/data/wiki_english/converted_csv/wiki_english_nodes_29350.csv (438686 lines)
[INFO] [2022-06-01 10:12:22] ...media (/app/public/data/wiki_english/media_resource.tab)
[CMD] [2022-06-01 10:12:22] /usr/bin/sort /app/public/data/wiki_english/converted_csv/wiki_english_media_29349.csv > /app/public/data/wiki_english/converted_csv/wiki_english_media_29349.csv_sorted
[INFO] [2022-06-01 10:13:16] Converted: /app/public/data/wiki_english/converted_csv/wiki_english_media_29349.csv (832790 lines)
[STOP] [2022-06-01 10:13:16] convert_to_csv
[START] [2022-06-01 10:13:16] calculate_delta
[INFO] [2022-06-01 10:13:16] Looping over 2 formats...
[INFO] [2022-06-01 10:13:16] ...nodes (/app/public/data/wiki_english/taxon.tab)
[CMD] [2022-06-01 10:13:16] echo "0a" > /app/public/data/wiki_english/diff/wiki_english_nodes_29350.diff
[CMD] [2022-06-01 10:13:17] tail -n +1 /app/public/data/wiki_english/converted_csv/wiki_english_nodes_29350.csv >> /app/public/data/wiki_english/diff/wiki_english_nodes_29350.diff
[CMD] [2022-06-01 10:13:17] echo "." >> /app/public/data/wiki_english/diff/wiki_english_nodes_29350.diff
[INFO] [2022-06-01 10:13:18] Created diff: /app/public/data/wiki_english/diff/wiki_english_nodes_29350.diff (438688 lines)
[INFO] [2022-06-01 10:13:18] ...media (/app/public/data/wiki_english/media_resource.tab)
[CMD] [2022-06-01 10:13:18] echo "0a" > /app/public/data/wiki_english/diff/wiki_english_media_29349.diff
[CMD] [2022-06-01 10:13:18] tail -n +1 /app/public/data/wiki_english/converted_csv/wiki_english_media_29349.csv >> /app/public/data/wiki_english/diff/wiki_english_media_29349.diff
[CMD] [2022-06-01 10:13:45] echo "." >> /app/public/data/wiki_english/diff/wiki_english_media_29349.diff
[INFO] [2022-06-01 10:13:50] Created diff: /app/public/data/wiki_english/diff/wiki_english_media_29349.diff (832792 lines)
[STOP] [2022-06-01 10:13:50] calculate_delta
[START] [2022-06-01 10:13:50] parse_diff_and_store
[INFO] [2022-06-01 10:13:50] Handling diff: /app/public/data/wiki_english/diff/wiki_english_nodes_29350.diff (438688 lines)
[INFO] [2022-06-01 10:13:51] Loading nodes diff file into memory (438688 lines)...
[INFO] [2022-06-01 10:13:54] Storing 9999 ScientificNames (29997/10000/438688)
[INFO] [2022-06-01 10:13:58] Storing 9999 Identifiers (29997/10000/438688)
[INFO] [2022-06-01 10:14:00] Storing 9999 Nodes (29997/10000/438688)
[INFO] [2022-06-01 10:14:08] Storing 10000 ScientificNames (59997/20000/438688)
[INFO] [2022-06-01 10:14:12] Storing 10000 Identifiers (59997/20000/438688)
[INFO] [2022-06-01 10:14:14] Storing 10000 Nodes (59997/20000/438688)
[INFO] [2022-06-01 10:14:22] Storing 10000 ScientificNames (89997/30000/438688)
[INFO] [2022-06-01 10:14:26] Storing 10000 Identifiers (89997/30000/438688)
[INFO] [2022-06-01 10:14:28] Storing 10000 Nodes (89997/30000/438688)
[WARN] [2022-06-01 10:14:35] Filtered Scientific Name `Cuon alpinus fumosus/javanicus` to `Cuon alpinus fumosusjavanicus`
[INFO] [2022-06-01 10:14:36] Storing 10000 ScientificNames (119997/40000/438688)
[INFO] [2022-06-01 10:14:40] Storing 10000 Identifiers (119997/40000/438688)
[INFO] [2022-06-01 10:14:42] Storing 10000 Nodes (119997/40000/438688)
[INFO] [2022-06-01 10:14:49] Storing 10000 ScientificNames (149997/50000/438688)
[INFO] [2022-06-01 10:14:53] Storing 10000 Identifiers (149997/50000/438688)
[INFO] [2022-06-01 10:14:55] Storing 10000 Nodes (149997/50000/438688)
[INFO] [2022-06-01 10:15:03] Storing 10000 ScientificNames (179997/60000/438688)
[INFO] [2022-06-01 10:15:06] Storing 10000 Identifiers (179997/60000/438688)
[INFO] [2022-06-01 10:15:08] Storing 10000 Nodes (179997/60000/438688)
[INFO] [2022-06-01 10:15:16] Storing 10000 ScientificNames (209997/70000/438688)
[INFO] [2022-06-01 10:15:19] Storing 10000 Identifiers (209997/70000/438688)
[INFO] [2022-06-01 10:15:21] Storing 10000 Nodes (209997/70000/438688)
[INFO] [2022-06-01 10:15:29] Storing 10000 ScientificNames (239997/80000/438688)
[INFO] [2022-06-01 10:15:32] Storing 10000 Identifiers (239997/80000/438688)
[INFO] [2022-06-01 10:15:34] Storing 10000 Nodes (239997/80000/438688)
[INFO] [2022-06-01 10:15:42] Storing 10000 ScientificNames (269998/90000/438688)
[INFO] [2022-06-01 10:15:45] Storing 10001 Identifiers (269998/90000/438688)
[INFO] [2022-06-01 10:15:47] Storing 10000 Nodes (269998/90000/438688)
[INFO] [2022-06-01 10:15:55] Storing 10000 ScientificNames (299998/100000/438688)
[INFO] [2022-06-01 10:15:58] Storing 10000 Identifiers (299998/100000/438688)
[INFO] [2022-06-01 10:16:00] Storing 10000 Nodes (299998/100000/438688)
[INFO] [2022-06-01 10:16:08] Storing 10000 ScientificNames (329998/110000/438688)
[INFO] [2022-06-01 10:16:12] Storing 10000 Identifiers (329998/110000/438688)
[INFO] [2022-06-01 10:16:13] Storing 10000 Nodes (329998/110000/438688)
[INFO] [2022-06-01 10:16:22] Storing 10000 ScientificNames (359998/120000/438688)
[INFO] [2022-06-01 10:16:25] Storing 10000 Identifiers (359998/120000/438688)
[INFO] [2022-06-01 10:16:27] Storing 10000 Nodes (359998/120000/438688)
[INFO] [2022-06-01 10:16:35] Storing 10000 ScientificNames (389998/130000/438688)
[INFO] [2022-06-01 10:16:38] Storing 10000 Identifiers (389998/130000/438688)
[INFO] [2022-06-01 10:16:40] Storing 10000 Nodes (389998/130000/438688)
[INFO] [2022-06-01 10:16:48] Storing 10000 ScientificNames (419998/140000/438688)
[INFO] [2022-06-01 10:16:51] Storing 10000 Identifiers (419998/140000/438688)
[INFO] [2022-06-01 10:16:53] Storing 10000 Nodes (419998/140000/438688)
[INFO] [2022-06-01 10:17:01] Storing 10000 ScientificNames (449998/150000/438688)
[INFO] [2022-06-01 10:17:04] Storing 10000 Identifiers (449998/150000/438688)
[INFO] [2022-06-01 10:17:06] Storing 10000 Nodes (449998/150000/438688)
[INFO] [2022-06-01 10:17:14] Storing 10000 ScientificNames (479998/160000/438688)
[INFO] [2022-06-01 10:17:18] Storing 10000 Identifiers (479998/160000/438688)
[INFO] [2022-06-01 10:17:19] Storing 10000 Nodes (479998/160000/438688)
[INFO] [2022-06-01 10:17:27] Storing 10000 ScientificNames (509998/170000/438688)
[INFO] [2022-06-01 10:17:31] Storing 10000 Identifiers (509998/170000/438688)
[INFO] [2022-06-01 10:17:33] Storing 10000 Nodes (509998/170000/438688)
[INFO] [2022-06-01 10:17:41] Storing 10000 ScientificNames (539998/180000/438688)
[INFO] [2022-06-01 10:17:45] Storing 10000 Identifiers (539998/180000/438688)
[INFO] [2022-06-01 10:17:47] Storing 10000 Nodes (539998/180000/438688)
[INFO] [2022-06-01 10:17:55] Storing 10000 ScientificNames (569999/190000/438688)
[INFO] [2022-06-01 10:17:58] Storing 10001 Identifiers (569999/190000/438688)
[INFO] [2022-06-01 10:18:00] Storing 10000 Nodes (569999/190000/438688)
[INFO] [2022-06-01 10:18:08] Storing 10000 ScientificNames (599999/200000/438688)
[INFO] [2022-06-01 10:18:11] Storing 10000 Identifiers (599999/200000/438688)
[INFO] [2022-06-01 10:18:13] Storing 10000 Nodes (599999/200000/438688)
[INFO] [2022-06-01 10:18:22] Storing 10000 ScientificNames (629999/210000/438688)
[INFO] [2022-06-01 10:18:25] Storing 10000 Identifiers (629999/210000/438688)
[INFO] [2022-06-01 10:18:27] Storing 10000 Nodes (629999/210000/438688)
[INFO] [2022-06-01 10:18:35] Storing 10000 ScientificNames (659999/220000/438688)
[INFO] [2022-06-01 10:18:39] Storing 10000 Identifiers (659999/220000/438688)
[INFO] [2022-06-01 10:18:41] Storing 10000 Nodes (659999/220000/438688)
[INFO] [2022-06-01 10:18:50] Storing 10000 ScientificNames (689999/230000/438688)
[INFO] [2022-06-01 10:18:54] Storing 10000 Identifiers (689999/230000/438688)
[INFO] [2022-06-01 10:18:55] Storing 10000 Nodes (689999/230000/438688)
[INFO] [2022-06-01 10:19:04] Storing 10000 ScientificNames (719999/240000/438688)
[INFO] [2022-06-01 10:19:08] Storing 10000 Identifiers (719999/240000/438688)
[INFO] [2022-06-01 10:19:09] Storing 10000 Nodes (719999/240000/438688)
[INFO] [2022-06-01 10:19:18] Storing 10000 ScientificNames (749999/250000/438688)
[INFO] [2022-06-01 10:19:22] Storing 10000 Identifiers (749999/250000/438688)
[INFO] [2022-06-01 10:19:23] Storing 10000 Nodes (749999/250000/438688)
[INFO] [2022-06-01 10:19:32] Storing 10000 ScientificNames (779999/260000/438688)
[INFO] [2022-06-01 10:19:36] Storing 10000 Identifiers (779999/260000/438688)
[INFO] [2022-06-01 10:19:37] Storing 10000 Nodes (779999/260000/438688)
[INFO] [2022-06-01 10:19:46] Storing 10000 ScientificNames (809999/270000/438688)
[INFO] [2022-06-01 10:19:50] Storing 10000 Identifiers (809999/270000/438688)
[INFO] [2022-06-01 10:19:52] Storing 10000 Nodes (809999/270000/438688)
[INFO] [2022-06-01 10:20:01] Storing 10000 ScientificNames (839999/280000/438688)
[INFO] [2022-06-01 10:20:05] Storing 10000 Identifiers (839999/280000/438688)
[INFO] [2022-06-01 10:20:06] Storing 10000 Nodes (839999/280000/438688)
[INFO] [2022-06-01 10:20:15] Storing 10000 ScientificNames (869999/290000/438688)
[INFO] [2022-06-01 10:20:19] Storing 10000 Identifiers (869999/290000/438688)
[INFO] [2022-06-01 10:20:20] Storing 10000 Nodes (869999/290000/438688)
[INFO] [2022-06-01 10:20:29] Storing 10000 ScientificNames (900000/300000/438688)
[INFO] [2022-06-01 10:20:33] Storing 10001 Identifiers (900000/300000/438688)
[INFO] [2022-06-01 10:20:34] Storing 10000 Nodes (900000/300000/438688)
[INFO] [2022-06-01 10:20:43] Storing 10000 ScientificNames (930000/310000/438688)
[INFO] [2022-06-01 10:20:47] Storing 10000 Identifiers (930000/310000/438688)
[INFO] [2022-06-01 10:20:49] Storing 10000 Nodes (930000/310000/438688)
[INFO] [2022-06-01 10:20:58] Storing 10000 ScientificNames (960000/320000/438688)
[INFO] [2022-06-01 10:21:02] Storing 10000 Identifiers (960000/320000/438688)
[INFO] [2022-06-01 10:21:04] Storing 10000 Nodes (960000/320000/438688)
[INFO] [2022-06-01 10:21:13] Storing 10000 ScientificNames (990000/330000/438688)
[INFO] [2022-06-01 10:21:17] Storing 10000 Identifiers (990000/330000/438688)
[INFO] [2022-06-01 10:21:19] Storing 10000 Nodes (990000/330000/438688)
[INFO] [2022-06-01 10:21:28] Storing 10000 ScientificNames (1020000/340000/438688)
[INFO] [2022-06-01 10:21:32] Storing 10000 Identifiers (1020000/340000/438688)
[INFO] [2022-06-01 10:21:34] Storing 10000 Nodes (1020000/340000/438688)
[INFO] [2022-06-01 10:21:43] Storing 10000 ScientificNames (1050000/350000/438688)
[INFO] [2022-06-01 10:21:48] Storing 10000 Identifiers (1050000/350000/438688)
[INFO] [2022-06-01 10:21:49] Storing 10000 Nodes (1050000/350000/438688)
[INFO] [2022-06-01 10:21:58] Storing 10000 ScientificNames (1080000/360000/438688)
[INFO] [2022-06-01 10:22:02] Storing 10000 Identifiers (1080000/360000/438688)
[INFO] [2022-06-01 10:22:04] Storing 10000 Nodes (1080000/360000/438688)
[INFO] [2022-06-01 10:22:13] Storing 10000 ScientificNames (1110000/370000/438688)
[INFO] [2022-06-01 10:22:17] Storing 10000 Identifiers (1110000/370000/438688)
[INFO] [2022-06-01 10:22:19] Storing 10000 Nodes (1110000/370000/438688)
[WARN] [2022-06-01 10:22:23] Filtered Scientific Name `Homalocephala  polycephala` to `Homalocephala polycephala`
[INFO] [2022-06-01 10:22:28] Storing 10000 ScientificNames (1140000/380000/438688)
[INFO] [2022-06-01 10:22:32] Storing 10000 Identifiers (1140000/380000/438688)
[INFO] [2022-06-01 10:22:34] Storing 10000 Nodes (1140000/380000/438688)
[INFO] [2022-06-01 10:22:44] Storing 10000 ScientificNames (1170000/390000/438688)
[INFO] [2022-06-01 10:22:48] Storing 10000 Identifiers (1170000/390000/438688)
[INFO] [2022-06-01 10:22:50] Storing 10000 Nodes (1170000/390000/438688)
[INFO] [2022-06-01 10:23:00] Storing 10000 ScientificNames (1200000/400000/438688)
[INFO] [2022-06-01 10:23:04] Storing 10000 Identifiers (1200000/400000/438688)
[INFO] [2022-06-01 10:23:06] Storing 10000 Nodes (1200000/400000/438688)
[INFO] [2022-06-01 10:23:15] Storing 10000 ScientificNames (1230001/410000/438688)
[INFO] [2022-06-01 10:23:20] Storing 10001 Identifiers (1230001/410000/438688)
[INFO] [2022-06-01 10:23:21] Storing 10000 Nodes (1230001/410000/438688)
[INFO] [2022-06-01 10:23:31] Storing 10000 ScientificNames (1260002/420000/438688)
[INFO] [2022-06-01 10:23:35] Storing 10001 Identifiers (1260002/420000/438688)
[INFO] [2022-06-01 10:23:37] Storing 10000 Nodes (1260002/420000/438688)
[INFO] [2022-06-01 10:23:47] Storing 10000 ScientificNames (1290002/430000/438688)
[INFO] [2022-06-01 10:23:51] Storing 10000 Identifiers (1290002/430000/438688)
[INFO] [2022-06-01 10:23:53] Storing 10000 Nodes (1290002/430000/438688)
[INFO] [2022-06-01 10:24:02] Storing 8687 ScientificNames (1316063/438686/438688)
[INFO] [2022-06-01 10:24:06] Storing 8687 Identifiers (1316063/438686/438688)
[INFO] [2022-06-01 10:24:07] Storing 8687 Nodes (1316063/438686/438688)
[INFO] [2022-06-01 10:24:15] Handling diff: /app/public/data/wiki_english/diff/wiki_english_media_29349.diff (832792 lines)
[INFO] [2022-06-01 10:24:15] Loading media diff file into memory (832792 lines)...
[INFO] [2022-06-01 10:24:50] Storing 9999 ArticlesSections (19998/10000/832792)
[INFO] [2022-06-01 10:24:50] Storing 9999 Articles (19998/10000/832792)
[INFO] [2022-06-01 10:25:33] Storing 10000 ArticlesSections (39998/20000/832792)
[INFO] [2022-06-01 10:25:34] Storing 10000 Articles (39998/20000/832792)
[INFO] [2022-06-01 10:26:13] Storing 10000 ArticlesSections (59998/30000/832792)
[INFO] [2022-06-01 10:26:14] Storing 10000 Articles (59998/30000/832792)
[INFO] [2022-06-01 10:26:52] Storing 10000 ArticlesSections (79998/40000/832792)
[INFO] [2022-06-01 10:26:52] Storing 10000 Articles (79998/40000/832792)
[INFO] [2022-06-01 10:27:31] Storing 10000 ArticlesSections (99998/50000/832792)
[INFO] [2022-06-01 10:27:31] Storing 10000 Articles (99998/50000/832792)
[INFO] [2022-06-01 10:28:10] Storing 10000 ArticlesSections (119998/60000/832792)
[INFO] [2022-06-01 10:28:11] Storing 10000 Articles (119998/60000/832792)
[INFO] [2022-06-01 10:28:48] Storing 10000 ArticlesSections (139998/70000/832792)
[INFO] [2022-06-01 10:28:49] Storing 10000 Articles (139998/70000/832792)
[INFO] [2022-06-01 10:29:28] Storing 10000 ArticlesSections (159998/80000/832792)
[INFO] [2022-06-01 10:29:29] Storing 10000 Articles (159998/80000/832792)
[INFO] [2022-06-01 10:30:06] Storing 10000 ArticlesSections (179998/90000/832792)
[INFO] [2022-06-01 10:30:07] Storing 10000 Articles (179998/90000/832792)
[INFO] [2022-06-01 10:30:46] Storing 10000 ArticlesSections (199998/100000/832792)
[INFO] [2022-06-01 10:30:47] Storing 10000 Articles (199998/100000/832792)
[INFO] [2022-06-01 10:31:26] Storing 10000 ArticlesSections (219998/110000/832792)
[INFO] [2022-06-01 10:31:26] Storing 10000 Articles (219998/110000/832792)
[INFO] [2022-06-01 10:32:06] Storing 10000 ArticlesSections (239998/120000/832792)
[INFO] [2022-06-01 10:32:07] Storing 10000 Articles (239998/120000/832792)
[INFO] [2022-06-01 10:32:45] Storing 10000 ArticlesSections (259998/130000/832792)
[INFO] [2022-06-01 10:32:45] Storing 10000 Articles (259998/130000/832792)
[INFO] [2022-06-01 10:33:23] Storing 10000 ArticlesSections (279998/140000/832792)
[INFO] [2022-06-01 10:33:23] Storing 10000 Articles (279998/140000/832792)
[INFO] [2022-06-01 10:34:01] Storing 10000 ArticlesSections (299998/150000/832792)
[INFO] [2022-06-01 10:34:01] Storing 10000 Articles (299998/150000/832792)
[INFO] [2022-06-01 10:34:40] Storing 10000 ArticlesSections (319998/160000/832792)
[INFO] [2022-06-01 10:34:41] Storing 10000 Articles (319998/160000/832792)
[INFO] [2022-06-01 10:35:20] Storing 10000 ArticlesSections (339998/170000/832792)
[INFO] [2022-06-01 10:35:20] Storing 10000 Articles (339998/170000/832792)
[INFO] [2022-06-01 10:35:59] Storing 10000 ArticlesSections (359998/180000/832792)
[INFO] [2022-06-01 10:36:00] Storing 10000 Articles (359998/180000/832792)
[INFO] [2022-06-01 10:36:39] Storing 10000 ArticlesSections (379998/190000/832792)
[INFO] [2022-06-01 10:36:39] Storing 10000 Articles (379998/190000/832792)
[INFO] [2022-06-01 10:37:18] Storing 10000 ArticlesSections (399998/200000/832792)
[INFO] [2022-06-01 10:37:18] Storing 10000 Articles (399998/200000/832792)
[INFO] [2022-06-01 10:37:56] Storing 10000 ArticlesSections (419998/210000/832792)
[INFO] [2022-06-01 10:37:57] Storing 10000 Articles (419998/210000/832792)
[INFO] [2022-06-01 10:38:36] Storing 10000 ArticlesSections (439998/220000/832792)
[INFO] [2022-06-01 10:38:36] Storing 10000 Articles (439998/220000/832792)
[INFO] [2022-06-01 10:39:15] Storing 10000 ArticlesSections (459998/230000/832792)
[INFO] [2022-06-01 10:39:16] Storing 10000 Articles (459998/230000/832792)
[INFO] [2022-06-01 10:39:54] Storing 10000 ArticlesSections (479998/240000/832792)
[INFO] [2022-06-01 10:39:55] Storing 10000 Articles (479998/240000/832792)
[INFO] [2022-06-01 10:40:35] Storing 10000 ArticlesSections (499998/250000/832792)
[INFO] [2022-06-01 10:40:36] Storing 10000 Articles (499998/250000/832792)
[INFO] [2022-06-01 10:41:14] Storing 10000 ArticlesSections (519998/260000/832792)
[INFO] [2022-06-01 10:41:14] Storing 10000 Articles (519998/260000/832792)
[INFO] [2022-06-01 10:41:54] Storing 10000 ArticlesSections (539998/270000/832792)
[INFO] [2022-06-01 10:41:55] Storing 10000 Articles (539998/270000/832792)
[INFO] [2022-06-01 10:42:34] Storing 10000 ArticlesSections (559998/280000/832792)
[INFO] [2022-06-01 10:42:35] Storing 10000 Articles (559998/280000/832792)
[INFO] [2022-06-01 10:43:15] Storing 10000 ArticlesSections (579998/290000/832792)
[INFO] [2022-06-01 10:43:16] Storing 10000 Articles (579998/290000/832792)
[INFO] [2022-06-01 10:43:55] Storing 10000 ArticlesSections (599998/300000/832792)
[INFO] [2022-06-01 10:43:56] Storing 10000 Articles (599998/300000/832792)
[INFO] [2022-06-01 10:44:37] Storing 10000 ArticlesSections (619998/310000/832792)
[INFO] [2022-06-01 10:44:38] Storing 10000 Articles (619998/310000/832792)
[INFO] [2022-06-01 10:45:18] Storing 10000 ArticlesSections (639998/320000/832792)
[INFO] [2022-06-01 10:45:19] Storing 10000 Articles (639998/320000/832792)
[INFO] [2022-06-01 10:45:58] Storing 10000 ArticlesSections (659998/330000/832792)
[INFO] [2022-06-01 10:45:59] Storing 10000 Articles (659998/330000/832792)
[INFO] [2022-06-01 10:46:38] Storing 10000 ArticlesSections (679998/340000/832792)
[INFO] [2022-06-01 10:46:39] Storing 10000 Articles (679998/340000/832792)
[INFO] [2022-06-01 10:47:20] Storing 10000 ArticlesSections (699998/350000/832792)
[INFO] [2022-06-01 10:47:20] Storing 10000 Articles (699998/350000/832792)
[INFO] [2022-06-01 10:48:02] Storing 10000 ArticlesSections (719998/360000/832792)
[INFO] [2022-06-01 10:48:02] Storing 10000 Articles (719998/360000/832792)
[INFO] [2022-06-01 10:48:43] Storing 10000 ArticlesSections (739998/370000/832792)
[INFO] [2022-06-01 10:48:44] Storing 10000 Articles (739998/370000/832792)
[INFO] [2022-06-01 10:49:23] Storing 10000 ArticlesSections (759998/380000/832792)
[INFO] [2022-06-01 10:49:23] Storing 10000 Articles (759998/380000/832792)
[INFO] [2022-06-01 10:50:03] Storing 10000 ArticlesSections (779998/390000/832792)
[INFO] [2022-06-01 10:50:04] Storing 10000 Articles (779998/390000/832792)
[INFO] [2022-06-01 10:50:42] Storing 10000 ArticlesSections (799998/400000/832792)
[INFO] [2022-06-01 10:50:43] Storing 10000 Articles (799998/400000/832792)
[INFO] [2022-06-01 10:51:23] Storing 10000 ArticlesSections (819998/410000/832792)
[INFO] [2022-06-01 10:51:23] Storing 10000 Articles (819998/410000/832792)
[INFO] [2022-06-01 10:52:03] Storing 10000 ArticlesSections (839998/420000/832792)
[INFO] [2022-06-01 10:52:03] Storing 10000 Articles (839998/420000/832792)
[INFO] [2022-06-01 10:52:44] Storing 10000 ArticlesSections (859998/430000/832792)
[INFO] [2022-06-01 10:52:45] Storing 10000 Articles (859998/430000/832792)
[INFO] [2022-06-01 10:53:25] Storing 10000 ArticlesSections (879998/440000/832792)
[INFO] [2022-06-01 10:53:25] Storing 10000 Articles (879998/440000/832792)
[INFO] [2022-06-01 10:54:03] Storing 10000 ArticlesSections (899998/450000/832792)
[INFO] [2022-06-01 10:54:04] Storing 10000 Articles (899998/450000/832792)
[INFO] [2022-06-01 10:54:43] Storing 10000 ArticlesSections (919998/460000/832792)
[INFO] [2022-06-01 10:54:44] Storing 10000 Articles (919998/460000/832792)
[INFO] [2022-06-01 10:55:24] Storing 10000 ArticlesSections (939998/470000/832792)
[INFO] [2022-06-01 10:55:25] Storing 10000 Articles (939998/470000/832792)
[INFO] [2022-06-01 10:56:03] Storing 10000 ArticlesSections (959998/480000/832792)
[INFO] [2022-06-01 10:56:04] Storing 10000 Articles (959998/480000/832792)
[INFO] [2022-06-01 10:56:43] Storing 10000 ArticlesSections (979998/490000/832792)
[INFO] [2022-06-01 10:56:43] Storing 10000 Articles (979998/490000/832792)
[INFO] [2022-06-01 10:57:25] Storing 10000 ArticlesSections (999998/500000/832792)
[INFO] [2022-06-01 10:57:26] Storing 10000 Articles (999998/500000/832792)
[INFO] [2022-06-01 10:58:06] Storing 10000 ArticlesSections (1019998/510000/832792)
[INFO] [2022-06-01 10:58:06] Storing 10000 Articles (1019998/510000/832792)
[INFO] [2022-06-01 10:58:47] Storing 10000 ArticlesSections (1039998/520000/832792)
[INFO] [2022-06-01 10:58:47] Storing 10000 Articles (1039998/520000/832792)
[INFO] [2022-06-01 10:59:28] Storing 10000 ArticlesSections (1059998/530000/832792)
[INFO] [2022-06-01 10:59:28] Storing 10000 Articles (1059998/530000/832792)
[INFO] [2022-06-01 11:00:09] Storing 10000 ArticlesSections (1079998/540000/832792)
[INFO] [2022-06-01 11:00:10] Storing 10000 Articles (1079998/540000/832792)
[INFO] [2022-06-01 11:00:52] Storing 10000 ArticlesSections (1099998/550000/832792)
[INFO] [2022-06-01 11:00:52] Storing 10000 Articles (1099998/550000/832792)
[INFO] [2022-06-01 11:01:33] Storing 10000 ArticlesSections (1119998/560000/832792)
[INFO] [2022-06-01 11:01:33] Storing 10000 Articles (1119998/560000/832792)
[INFO] [2022-06-01 11:02:14] Storing 10000 ArticlesSections (1139998/570000/832792)
[INFO] [2022-06-01 11:02:15] Storing 10000 Articles (1139998/570000/832792)
[INFO] [2022-06-01 11:02:56] Storing 10000 ArticlesSections (1159998/580000/832792)
[INFO] [2022-06-01 11:02:56] Storing 10000 Articles (1159998/580000/832792)
[INFO] [2022-06-01 11:03:37] Storing 10000 ArticlesSections (1179998/590000/832792)
[INFO] [2022-06-01 11:03:38] Storing 10000 Articles (1179998/590000/832792)
[INFO] [2022-06-01 11:04:19] Storing 10000 ArticlesSections (1199998/600000/832792)
[INFO] [2022-06-01 11:04:19] Storing 10000 Articles (1199998/600000/832792)
[INFO] [2022-06-01 11:05:00] Storing 10000 ArticlesSections (1219998/610000/832792)
[INFO] [2022-06-01 11:05:01] Storing 10000 Articles (1219998/610000/832792)
[INFO] [2022-06-01 11:05:43] Storing 10000 ArticlesSections (1239998/620000/832792)
[INFO] [2022-06-01 11:05:43] Storing 10000 Articles (1239998/620000/832792)
[INFO] [2022-06-01 11:06:24] Storing 10000 ArticlesSections (1259998/630000/832792)
[INFO] [2022-06-01 11:06:24] Storing 10000 Articles (1259998/630000/832792)
[INFO] [2022-06-01 11:07:05] Storing 10000 ArticlesSections (1279998/640000/832792)
[INFO] [2022-06-01 11:07:06] Storing 10000 Articles (1279998/640000/832792)
[INFO] [2022-06-01 11:07:47] Storing 10000 ArticlesSections (1299998/650000/832792)
[INFO] [2022-06-01 11:07:47] Storing 10000 Articles (1299998/650000/832792)
[INFO] [2022-06-01 11:08:29] Storing 10000 ArticlesSections (1319998/660000/832792)
[INFO] [2022-06-01 11:08:29] Storing 10000 Articles (1319998/660000/832792)
[INFO] [2022-06-01 11:09:09] Storing 10000 ArticlesSections (1339998/670000/832792)
[INFO] [2022-06-01 11:09:09] Storing 10000 Articles (1339998/670000/832792)
[INFO] [2022-06-01 11:09:50] Storing 10000 ArticlesSections (1359998/680000/832792)
[INFO] [2022-06-01 11:09:50] Storing 10000 Articles (1359998/680000/832792)
[INFO] [2022-06-01 11:10:32] Storing 10000 ArticlesSections (1379998/690000/832792)
[INFO] [2022-06-01 11:10:33] Storing 10000 Articles (1379998/690000/832792)
[INFO] [2022-06-01 11:11:14] Storing 10000 ArticlesSections (1399998/700000/832792)
[INFO] [2022-06-01 11:11:14] Storing 10000 Articles (1399998/700000/832792)
[INFO] [2022-06-01 11:11:55] Storing 10000 ArticlesSections (1419998/710000/832792)
[INFO] [2022-06-01 11:11:55] Storing 10000 Articles (1419998/710000/832792)
[INFO] [2022-06-01 11:12:38] Storing 10000 ArticlesSections (1439998/720000/832792)
[INFO] [2022-06-01 11:12:38] Storing 10000 Articles (1439998/720000/832792)
[INFO] [2022-06-01 11:13:19] Storing 10000 ArticlesSections (1459998/730000/832792)
[INFO] [2022-06-01 11:13:20] Storing 10000 Articles (1459998/730000/832792)
[INFO] [2022-06-01 11:14:03] Storing 10000 ArticlesSections (1479998/740000/832792)
[INFO] [2022-06-01 11:14:03] Storing 10000 Articles (1479998/740000/832792)
[INFO] [2022-06-01 11:14:45] Storing 10000 ArticlesSections (1499998/750000/832792)
[INFO] [2022-06-01 11:14:46] Storing 10000 Articles (1499998/750000/832792)
[INFO] [2022-06-01 11:15:28] Storing 10000 ArticlesSections (1519998/760000/832792)
[INFO] [2022-06-01 11:15:29] Storing 10000 Articles (1519998/760000/832792)
[INFO] [2022-06-01 11:16:10] Storing 10000 ArticlesSections (1539998/770000/832792)
[INFO] [2022-06-01 11:16:10] Storing 10000 Articles (1539998/770000/832792)
[INFO] [2022-06-01 11:16:53] Storing 10000 ArticlesSections (1559998/780000/832792)
[INFO] [2022-06-01 11:16:53] Storing 10000 Articles (1559998/780000/832792)
[INFO] [2022-06-01 11:17:34] Storing 10000 ArticlesSections (1579998/790000/832792)
[INFO] [2022-06-01 11:17:35] Storing 10000 Articles (1579998/790000/832792)
[INFO] [2022-06-01 11:18:17] Storing 10000 ArticlesSections (1599998/800000/832792)
[INFO] [2022-06-01 11:18:17] Storing 10000 Articles (1599998/800000/832792)
[INFO] [2022-06-01 11:18:59] Storing 10000 ArticlesSections (1619998/810000/832792)
[INFO] [2022-06-01 11:18:59] Storing 10000 Articles (1619998/810000/832792)
[INFO] [2022-06-01 11:19:41] Storing 10000 ArticlesSections (1639998/820000/832792)
[INFO] [2022-06-01 11:19:41] Storing 10000 Articles (1639998/820000/832792)
[INFO] [2022-06-01 11:20:23] Storing 10000 ArticlesSections (1659998/830000/832792)
[INFO] [2022-06-01 11:20:24] Storing 10000 Articles (1659998/830000/832792)
[INFO] [2022-06-01 11:20:42] Storing 2791 ArticlesSections (1665580/832790/832792)
[INFO] [2022-06-01 11:20:42] Storing 2791 Articles (1665580/832790/832792)
[STOP] [2022-06-01 11:20:43] parse_diff_and_store
[START] [2022-06-01 11:20:43] resolve_keys
[2022-06-01 11:23:04] Resolving downloaded urls (this is not actually downloading them yet)
[INFO] [2022-06-01 11:30:22] Occurrences to nodes (through scientific_names)...
[INFO] [2022-06-01 11:30:22] traits to occurrences...
[INFO] [2022-06-01 11:30:22] traits to nodes (through occurrences)...
[INFO] [2022-06-01 11:30:22] Traits to sex term...
[INFO] [2022-06-01 11:30:22] Traits to lifestage term...
[INFO] [2022-06-01 11:30:22] MetaTraits to traits...
[INFO] [2022-06-01 11:30:22] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2022-06-01 11:30:22] Assocs to occurrences...
[INFO] [2022-06-01 11:30:22] Assocs to nodes...
[INFO] [2022-06-01 11:30:22] Assoc to sex term...
[INFO] [2022-06-01 11:30:22] Assoc to lifestage term...
[INFO] [2022-06-01 11:30:22] MetaAssoc to assocs...
[STOP] [2022-06-01 11:30:22] resolve_keys
[START] [2022-06-01 11:30:22] hold_for_later_1
[STOP] [2022-06-01 11:30:22] hold_for_later_1
[START] [2022-06-01 11:30:22] hold_for_later_2
[STOP] [2022-06-01 11:30:22] hold_for_later_2
[START] [2022-06-01 11:30:22] resolve_missing_parents
[STOP] [2022-06-01 11:30:46] resolve_missing_parents
[START] [2022-06-01 11:30:46] rebuild_nodes
[START] [2022-06-01 11:30:46] Flattener#flatten
[START] [2022-06-01 11:30:46] Flattener#study_resource
[START] [2022-06-01 11:31:31] Flattener#build_ancestry
[STOP] [2022-06-01 11:36:56] Flattener#build_ancestry
[INFO] [2022-06-01 11:36:56] 438686 ancestry keys
[START] [2022-06-01 11:36:56] build_node_ancestors
[INFO] [2022-06-01 11:36:56] old ancestors deleted.
[STOP] [2022-06-01 11:54:36] build_node_ancestors
[START] [2022-06-01 11:54:38] Flattener#propagate_ancestor_ids
[STOP] [2022-06-01 11:59:40] Flattener#propagate_ancestor_ids
[STOP] [2022-06-01 11:59:40] Flattener#flatten
[STOP] [2022-06-01 11:59:40] rebuild_nodes
[START] [2022-06-01 11:59:40] resolve_missing_media_owners
[STOP] [2022-06-01 11:59:40] resolve_missing_media_owners
[START] [2022-06-01 11:59:40] sanitize_media_verbatims
[STOP] [2022-06-01 11:59:40] sanitize_media_verbatims
[START] [2022-06-01 11:59:40] queue_downloads
[STOP] [2022-06-01 11:59:40] queue_downloads
[START] [2022-06-01 11:59:40] parse_names
[WARN] [2022-06-01 11:59:41] I see 438686 names which still need to be parsed.
[WARN] [2022-06-01 11:59:42] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 11:59:50] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 11:59:59] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:00:07] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:00:15] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:00:23] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:00:30] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:00:38] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:00:46] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:00:54] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:01:01] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:01:09] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:01:17] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:01:24] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:01:32] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:01:40] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:01:47] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:01:55] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:02:02] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:02:09] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:02:17] Names to parse: 10000 formatted: 10000 learned: 10000 parsed: 10000
[WARN] [2022-06-01 12:02:25] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:02:32] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:02:39] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:02:47] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:02:54] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:03:02] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:03:09] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:03:16] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:03:24] Names to parse: 10000 formatted: 10000 learned: 9994 parsed: 10000
[WARN] [2022-06-01 12:03:32] Names to parse: 10000 formatted: 10000 learned: 9993 parsed: 10000
[WARN] [2022-06-01 12:03:39] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:03:46] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:03:54] Names to parse: 10000 formatted: 10000 learned: 9996 parsed: 10000
[WARN] [2022-06-01 12:04:01] Names to parse: 10000 formatted: 10000 learned: 9999 parsed: 10000
[WARN] [2022-06-01 12:04:08] Names to parse: 10000 formatted: 10000 learned: 9994 parsed: 10000
[WARN] [2022-06-01 12:04:16] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:04:22] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:04:30] Names to parse: 10000 formatted: 10000 learned: 9994 parsed: 10000
[WARN] [2022-06-01 12:04:37] Names to parse: 10000 formatted: 10000 learned: 9995 parsed: 10000
[WARN] [2022-06-01 12:04:44] Names to parse: 10000 formatted: 10000 learned: 9997 parsed: 10000
[WARN] [2022-06-01 12:04:51] Names to parse: 10000 formatted: 10000 learned: 9993 parsed: 10000
[WARN] [2022-06-01 12:04:59] Names to parse: 10000 formatted: 10000 learned: 9998 parsed: 10000
[WARN] [2022-06-01 12:05:06] Names to parse: 8686 formatted: 8686 learned: 8684 parsed: 8686
[STOP] [2022-06-01 12:05:13] parse_names
[START] [2022-06-01 12:05:13] denormalize_canonical_names_to_nodes
[STOP] [2022-06-01 12:05:26] denormalize_canonical_names_to_nodes
[START] [2022-06-01 12:05:26] match_nodes
[START] [2022-06-01 12:05:27] map_all_nodes_to_pages
[STOP] [2022-06-01 13:55:27] map_all_nodes_to_pages
[INFO] [2022-06-01 13:55:27] 6329 Unmatched nodes (of 438686)! That's too many to output. Full list in /app/public/data/wiki_english/unmatched_nodes.txt ; First 10: Canonical: Hayasakaia; Node#115134391; ResourceID: Q106169081; Canonical: Pseudagkistrodon rudis; Node#115136255; ResourceID: Q106595923; Canonical: Levicepolis; Node#115136926; ResourceID: Q106781518; Canonical: Melikaiella; Node#115136949; ResourceID: Q106784548; Canonical: Mesobaetis; Node#115137418; ResourceID: Q106950589; Canonical: Artigasia; Node#115137726; ResourceID: Q107029819; Canonical: Micromphalia; Node#115137857; ResourceID: Q107052159; Canonical: Cirratulida; Node#115138067; ResourceID: Q107122642; Canonical: Opheliida; Node#115138068; ResourceID: Q107122700; Canonical: Myenchildae; Node#115138074; ResourceID: Q107126081
[START] [2022-06-01 13:55:27] update_nodes
[STOP] [2022-06-01 13:55:49] update_nodes
[STOP] [2022-06-01 13:55:49] match_nodes
[START] [2022-06-01 13:55:49] reindex_search
[STOP] [2022-06-01 14:11:25] reindex_search
[START] [2022-06-01 14:11:25] normalize_units
[STOP] [2022-06-01 14:11:25] normalize_units
[START] [2022-06-01 14:11:25] calculate_statistics
[INFO] [2022-06-01 14:11:43] Duplicate page_id count: 0
[STOP] [2022-06-01 14:11:43] calculate_statistics
[START] [2022-06-01 14:11:43] complete_harvest_instance
[START] [2022-06-01 14:11:43] overall_tsv_creation
[INFO] [2022-06-01 14:11:44] Processing group of 438686 in 44 batches of 10000
[WARN] [2022-06-01 15:15:28] Encountered new rank, please ensure there are handlers for it: synonym
[INFO] [2022-06-01 15:34:25] Average Time: 54.896
[INFO] [2022-06-01 15:34:25] Total Time: 1h22m42s
[INFO] [2022-06-01 15:34:25] last 3 / first 3: 0.95
[INFO] [2022-06-01 15:34:25] Std.Dev: 3.364; Max: 68.34
[STOP] [2022-06-01 15:34:25] overall_tsv_creation
[INFO] [2022-06-01 15:34:25] Done. Check your files:
[INFO] [2022-06-01 15:34:25] (438686 lines) /app/public/data/wiki_english/publish_nodes.tsv
[INFO] [2022-06-01 15:34:25] (438691 lines) /app/public/data/wiki_english/publish_identifiers.tsv
[INFO] [2022-06-01 15:34:26] (10631434 lines) /app/public/data/wiki_english/publish_node_ancestors.tsv
[INFO] [2022-06-01 15:34:26] (438686 lines) /app/public/data/wiki_english/publish_scientific_names.tsv
[INFO] [2022-06-01 15:34:27] (6112095 lines) /app/public/data/wiki_english/publish_articles.tsv
[INFO] [2022-06-01 15:34:27] (832790 lines) /app/public/data/wiki_english/publish_content_sections.tsv
[STOP] [2022-06-01 15:34:27] complete_harvest_instance
[START] [2022-06-01 15:34:27] completed
[STOP] [2022-06-01 15:34:27] completed
[STOP] [2022-06-01 15:34:28] logged process, took 19440.29

Latest Process