Can't fix XFES 2.0 Beta Rebuild Error

MattW

Affected version
2.0 Beta 1
Code:
XFES\Elasticsearch\BulkRequestException: Elasticsearch indexing error: Elasticsearch bulk action error (first error: [thread-1282] failed to parse [prefix]) src/addons/XFES/Elasticsearch/Api.php:394
Generated by: Matt Sep 6, 2017 at 11:33 AM
Stack trace
#0 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/addons/XFES/Elasticsearch/Api.php(171): XFES\Elasticsearch\Api->bulkRequest('{"index":{"_ind...')
#1 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/addons/XFES/Search/Source/Elasticsearch.php(82): XFES\Elasticsearch\Api->indexBulk(Array)
#2 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/addons/XFES/Search/Source/Elasticsearch.php(57): XFES\Search\Source\Elasticsearch->flushBulkIndexing()
#3 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Search/Search.php(40): XFES\Search\Source\Elasticsearch->index(Object(XF\Search\IndexRecord))
#4 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Search/Search.php(59): XF\Search\Search->index('post', Object(XF\Entity\Post))
#5 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Search/Search.php(85): XF\Search\Search->indexEntities('post', Object(XF\Mvc\Entity\ArrayCollection))
#6 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Job/SearchRebuild.php(57): XF\Search\Search->indexRange('post', 4914, '500')
#7 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Job/Manager.php(193): XF\Job\SearchRebuild->run(8)
#8 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Job/Manager.php(140): XF\Job\Manager->runJobInternal(Array, 8)
#9 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Admin/Controller/Tools.php(92): XF\Job\Manager->runJobEntry(Array, 8)
#10 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Mvc/Dispatcher.php(232): XF\Admin\Controller\Tools->actionRunJob(Object(XF\Mvc\ParameterBag))
#11 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Mvc/Dispatcher.php(85): XF\Mvc\Dispatcher->dispatchClass('XF:Tools', 'RunJob', 'html', Object(XF\Mvc\ParameterBag), 'tools')
#12 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/Mvc/Dispatcher.php(41): XF\Mvc\Dispatcher->dispatchLoop(Object(XF\Mvc\RouteMatch))
#13 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF/App.php(1771): XF\Mvc\Dispatcher->run()
#14 /home/nginx/domains/xf2.mattwservices.co.uk/public/src/XF.php(319): XF\App->run()
#15 /home/nginx/domains/xf2.mattwservices.co.uk/public/admin.php(13): XF::runApp('XF\\Admin\\App')
#16 {main}
Request state
array(3) {
  ["url"] => string(24) "/admin.php?tools/run-job"
  ["_GET"] => array(1) {
    ["tools/run-job"] => string(0) ""
  }
  ["_POST"] => array(4) {
    ["_xfRedirect"] => string(65) "https://xf2.mattwservices.co.uk/admin.php?tools/rebuild&success=1"
    ["_xfToken"] => string(8) "********"
    ["only"] => string(23) "RebuildXF:SearchRebuild"
    ["only_id"] => string(1) "0"
  }
}

Not sure if this is related to an old add-on?
 
Did you have an add-on that allowed multiple prefixes to be specified for a thread?

If so, I suspect it's caused by that. I haven't seen the add-on, but if I recall correctly, it changed core schema components, which is generally a very bad idea.

If you have that, can you report the column definition for the prefix_id column in the xf_thread table?
 
Can you show me the schema of the xf_thread table (and the prefix_id column specifically)? (@Xon, you might be able to help -- I don't have access to the add-on.) Based on some things I've seen in the thread, I know it changes the format of the prefix_id column.

I'm not really sure what the best way to solve this is. I assume that in XF1, the add-on had changes to how prefix_id got indexed. Alternatively, this is potentially related to the fact that we declare prefix to be an int (or array of ints) explicitly in XF2. Without that add-on running, it'll try to index whatever is actually in the DB and that won't work. This will be causing some odd behavior in the default search too, though not an explicit error.
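To illustrate why a field declared as an integer rejects a comma-separated value, here is a rough sketch in Python. It is not XFES or Elasticsearch code: `coerce_prefix` and `try_index` are made-up names, the value `3,7` is an assumed example of what the column might hold, and the real parsing happens inside Elasticsearch. The error text only mimics the message in the report above.

```python
def coerce_prefix(value):
    """Mimic a strict integer field: accept an int, a numeric string, or a list of them."""
    if isinstance(value, list):
        return [coerce_prefix(item) for item in value]
    if isinstance(value, bytes):
        value = value.decode("ascii")
    return int(value)  # raises ValueError for something like "3,7"


def try_index(doc_id, prefix):
    """Return (ok, detail) instead of raising, to show which documents would be rejected."""
    try:
        return True, coerce_prefix(prefix)
    except ValueError:
        return False, f"[{doc_id}] failed to parse [prefix]"
```

With these definitions, `try_index("thread-1", "5")` succeeds, while `try_index("thread-1282", b"3,7")` reports a parse failure for that document.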

Assuming the add-on were updated to XF2 using its current approach, this same issue would happen if it were disabled, so clearly this approach is a definite no-go on that basis. I'm just not sure what the best approach for people who have installed it previously would be...
 
Code:
MariaDB [forum]> describe xf_thread;
+--------------------+---------------------------------------+------+-----+---------+----------------+
| Field              | Type                                  | Null | Key | Default | Extra          |
+--------------------+---------------------------------------+------+-----+---------+----------------+
| thread_id          | int(10) unsigned                      | NO   | PRI | NULL    | auto_increment |
| node_id            | int(10) unsigned                      | NO   | MUL | NULL    |                |
| title              | varchar(150)                          | NO   |     | NULL    |                |
| reply_count        | int(10) unsigned                      | NO   |     | 0       |                |
| view_count         | int(10) unsigned                      | NO   |     | 0       |                |
| user_id            | int(10) unsigned                      | NO   | MUL | NULL    |                |
| username           | varchar(50)                           | NO   |     | NULL    |                |
| post_date          | int(10) unsigned                      | NO   | MUL | NULL    |                |
| sticky             | tinyint(3) unsigned                   | NO   |     | 0       |                |
| discussion_state   | enum('visible','moderated','deleted') | NO   |     | visible |                |
| discussion_open    | tinyint(3) unsigned                   | NO   |     | 1       |                |
| discussion_type    | varchar(25)                           | NO   |     |         |                |
| first_post_id      | int(10) unsigned                      | NO   |     | NULL    |                |
| first_post_likes   | int(10) unsigned                      | NO   |     | 0       |                |
| last_post_date     | int(10) unsigned                      | NO   | MUL | NULL    |                |
| last_post_id       | int(10) unsigned                      | NO   |     | NULL    |                |
| last_post_user_id  | int(10) unsigned                      | NO   |     | NULL    |                |
| last_post_username | varchar(50)                           | NO   |     | NULL    |                |
| prefix_id          | varbinary(255)                        | NO   |     | 0       |                |
| cta_ft_featured    | tinyint(3) unsigned                   | NO   |     | 0       |                |
| custom_fields      | mediumblob                            | YES  |     | NULL    |                |
| tags               | mediumblob                            | NO   |     | NULL    |                |
+--------------------+---------------------------------------+------+-----+---------+----------------+
22 rows in set (0.00 sec)
 
Multi Prefix stores the prefixes as a comma-separated list, which is then exploded into an array of ints when the content is indexed (or prepared). Elasticsearch doesn't actually care whether the value is an int or an array of ints for indexing purposes.
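The "explode into an array of ints" step could look roughly like this. It is a Python sketch for illustration only; the add-on itself is PHP, and `explode_prefix_ids` is a hypothetical name, not a function from Multi Prefix.

```python
def explode_prefix_ids(raw):
    """Split a comma-separated prefix_id value into a list of ints."""
    if isinstance(raw, bytes):
        raw = raw.decode("ascii")
    # Skip empty pieces so "" or a trailing comma does not produce a bad entry.
    return [int(part) for part in str(raw).split(",") if part.strip()]
```

A single stored ID still comes back as a one-element list, so the caller always hands the indexer the same shape.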
 
The following SQL can be used to remove Multi Prefix's SQL changes:
Code:
-- Keep only the first prefix ID from each comma-separated list (secondary prefixes are discarded).
UPDATE xf_thread SET prefix_id = IF(LEFT(prefix_id, LOCATE(',', prefix_id) - 1) != '', LEFT(prefix_id, LOCATE(',', prefix_id) - 1), prefix_id);
-- Treat an empty value as "no prefix".
UPDATE xf_thread SET prefix_id = '0' WHERE prefix_id = '';

-- Turn the column back into an integer.
ALTER TABLE xf_thread CHANGE COLUMN prefix_id prefix_id INT DEFAULT 0;

-- Same steps for xf_resource.
UPDATE xf_resource SET prefix_id = IF(LEFT(prefix_id, LOCATE(',', prefix_id) - 1) != '', LEFT(prefix_id, LOCATE(',', prefix_id) - 1), prefix_id);
UPDATE xf_resource SET prefix_id = '0' WHERE prefix_id = '';

ALTER TABLE xf_resource CHANGE COLUMN prefix_id prefix_id INT DEFAULT 0;
 
Technically, those alters should have "unsigned" as well. Of course, doing this would also lead to data loss (the secondary prefixes).
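To make the data loss concrete, here is a small Python sketch of what the two UPDATE statements do to a single value. `collapse_to_primary` is a hypothetical helper written for this illustration, and malformed values (a leading comma, for instance) are not handled.

```python
def collapse_to_primary(raw):
    """Return (primary, dropped) for one comma-separated prefix_id value."""
    head, _, rest = raw.partition(",")
    # Same fallback as the IF(...) expression: no text before a comma means keep the value.
    primary = head if head else raw
    if primary == "":
        primary = "0"  # second UPDATE: an empty string becomes '0'
    # Everything after the first comma is what the UPDATE throws away.
    dropped = [int(part) for part in rest.split(",") if part]
    return primary, dropped
```

For a stored value of `3,7,12`, the thread keeps prefix 3 and loses 7 and 12.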

So strictly speaking, I'm going to tag this as "can't fix" as we wouldn't try to undo conflicting schema changes from an add-on, particularly where it's clear it will cause data loss. However, in beta 2, we are detecting known conflicts based on some specific unexpected schema changes and blocking the upgrade from starting. We're adding the prefix_id change to xf_thread to that list, so Multi Prefix would need to be uninstalled before upgrading, or an alternative version would need to be released that doesn't change the structure of a core column.
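A check of that kind can be pictured as comparing the reported column types against what the core schema expects. The Python below is only a sketch of the idea, not the XF2 upgrader: `EXPECTED_COLUMNS` and `find_schema_conflicts` are invented names, and the expected type `int(10) unsigned` is an assumption based on the other ID columns in the table above.

```python
# Assumed core definition for the column; not taken from the XenForo source.
EXPECTED_COLUMNS = {
    ("xf_thread", "prefix_id"): "int(10) unsigned",
}


def find_schema_conflicts(actual_columns, expected=EXPECTED_COLUMNS):
    """Return (table, column, expected_type, actual_type) for each mismatched column."""
    conflicts = []
    for (table, column), expected_type in expected.items():
        actual_type = actual_columns.get((table, column))
        if actual_type is not None and actual_type.lower() != expected_type:
            conflicts.append((table, column, expected_type, actual_type))
    return conflicts
```

Feeding it the `varbinary(255)` type from the DESCRIBE output above yields one conflict, which is the point at which an upgrade would refuse to start.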

While I don't know any of the technical details, I'd likely recommend moving to storing multi-prefix data into prefix_ids and keeping the first one synced with prefix_id. This should be easier in XF2, so assuming the add-on is going to be updated for XF2, it may be easiest to do that as an "upgrade prep" release.
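As a rough sketch of that suggestion (in Python, with a made-up `ThreadRow` class standing in for the thread record): the full list lives in `prefix_ids`, and `prefix_id` is always rewritten to the first entry, so the core column stays a plain integer.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadRow:
    prefix_id: int = 0  # core column, stays a single integer
    prefix_ids: list = field(default_factory=list)  # add-on data, kept separately

    def set_prefixes(self, ids):
        """Store all prefixes and keep prefix_id synced with the first one."""
        unique = []
        for value in ids:
            value = int(value)
            if value > 0 and value not in unique:
                unique.append(value)
        self.prefix_ids = unique
        self.prefix_id = unique[0] if unique else 0
```

With this layout, disabling the add-on leaves `prefix_id` valid for the core code and for indexing.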
 
I am planning on porting MultiPrefix to XF2, but just haven't had the time to look at it yet.

I've actually started on this, working towards restoring the original prefix_id column and using a different schema for storing the data. I just haven't had time to finish it, and it hasn't been a priority.
 