Lack of interest 'Copy' function to remove sensitive data (Security) and personal information (GDPR) from Server Error Report

Alpha1 · May 31, 2019

When you report a bug to XenForo or on a developer's site, you often need to copy the server error report from the XenForo admin panel to the forum of the developer or XenForo. This report often contains personal information such as IP address, email, and username. It also contains sensitive information such as server path, domain name, database, and IP.

I have reported thousands of issues over the years and I always edit this information out, which is a drag. People often do not edit it out. As a result, the XenForo bug forum and the bug forums of developers contain a lot of private/personal information, which is not a good idea considering the GDPR, and sensitive information, which is not a good idea security-wise.

It would be very handy if the server error report page had a 'copy' button that copies only the required data from the report. This tiny function would save a lot of work and prevent the above issues.

Kirby · May 31, 2019

The "required data" is not computable, so I doubt that would even be possible.

There is already some code to filter out things like passwords or the XF root path (other paths should not really be sensitive, since the XF source code structure is known), but other than that it is not really possible to filter everything that might be sensitive.

Though it would be possible to skip the IP address (and username/userid, if any) of the client that generated the error.

Alpha1 · May 31, 2019

I am not suggesting the report itself needs to be sanitized. I am suggesting a function to sanitize a copy of the report. The developer of XenReviews created such a function to remove this data from bug reports, and he was finished coding within an hour.

IIRC it also removed thread titles, for the case of adult content sites. As long as the IDs are there, titles are not needed.

Kirby · May 31, 2019

As said before, this is simply not really possible.

Take the following example:

Code:

InvalidArgumentException: Template public:_page_node.7 error: Container key 'kirby_seo_pagecontext' was not found src/XF/Container.php:43
Generated by: kirby May 19, 2019 at 3:21 AM
Stack trace
#0 src/XF/App.php(2213): XF\Container->offsetGet('kirby_seo_pagecontext')
#1 internal_data/code_cache/templates/l3/s1/public/_page_node.7.php(9): XF\App->offsetGet('kirby_seo_pagecontext')
#2 src/XF/Template/Templater.php(1301): XF\Template\Templater->{closure}(Object(Kirby\Seo\XF\Template\Templater), Array)
#3 src/XF/Template/Templater.php(1374): XF\Template\Templater->renderTemplate('_page_node.7', Array)
#4 internal_data/code_cache/templates/l3/s1/public/page_view.php(104): XF\Template\Templater->includeTemplate('public:_page_no...', Array)
#5 src/XF/Template/Templater.php(1301): XF\Template\Templater->{closure}(Object(Kirby\Seo\XF\Template\Templater), Array)
#6 src/XF/Template/Template.php(24): XF\Template\Templater->renderTemplate('page_view', Array)
#7 src/XF/Mvc/Renderer/Html.php(48): XF\Template\Template->render()
#8 src/XF/Mvc/Dispatcher.php(418): XF\Mvc\Renderer\Html->renderView('XF:Page\\View', 'public:page_vie...', Array)
#9 src/XF/Mvc/Dispatcher.php(400): XF\Mvc\Dispatcher->renderView(Object(XF\Mvc\Renderer\Html), Object(XF\Mvc\Reply\View))
#10 src/XF/Mvc/Dispatcher.php(360): XF\Mvc\Dispatcher->renderReply(Object(XF\Mvc\Renderer\Html), Object(XF\Mvc\Reply\View))
#11 src/XF/Mvc/Dispatcher.php(53): XF\Mvc\Dispatcher->render(Object(XF\Mvc\Reply\View), 'html')
#12 src/XF/App.php(2177): XF\Mvc\Dispatcher->run()
#13 src/XF.php(390): XF\App->run()
#14 index.php(20): XF::runApp('XF\\Pub\\App')
#15 {main}

This clearly shows my username (Kirby), which you want to have removed.

It's easy to do that for "Generated by", but it's not possible for the stack trace - are those occurrences of "kirby" my username?
Or just some data that coincidentally contains the same text?

If you completely remove/censor "kirby" in that error, it becomes a lot less useful - maybe even to the extent that it is unusable.
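
To make the difference concrete, here is a minimal sketch in Python (used only for brevity; XenForo itself is PHP and nothing here is XenForo code). The helper names are hypothetical and the sample is shortened from the error above. Dropping the "Generated by" line is a targeted edit, while a blanket replace of the username also rewrites the container key and the add-on namespace:

```python
import re

# Shortened sample taken from the error report above.
REPORT = """InvalidArgumentException: Template public:_page_node.7 error: Container key 'kirby_seo_pagecontext' was not found src/XF/Container.php:43
Generated by: kirby May 19, 2019 at 3:21 AM
#0 src/XF/App.php(2213): XF\\Container->offsetGet('kirby_seo_pagecontext')
#2 src/XF/Template/Templater.php(1301): XF\\Template\\Templater->{closure}(Object(Kirby\\Seo\\XF\\Template\\Templater), Array)"""


def strip_generated_by(report: str) -> str:
    """Drop only the 'Generated by:' line (hypothetical helper)."""
    kept = [line for line in report.splitlines()
            if not line.startswith("Generated by:")]
    return "\n".join(kept)


def censor_everywhere(report: str, username: str) -> str:
    """Naive case-insensitive find-and-replace of the username."""
    return re.sub(re.escape(username), "[removed]", report, flags=re.IGNORECASE)


targeted = strip_generated_by(REPORT)          # trace stays intact
blanket = censor_everywhere(REPORT, "kirby")   # trace loses the key and namespace
```

After the blanket replace, the missing container key and the class that extended the templater can no longer be read from the report, which is exactly the information a developer would need.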

Robust · May 31, 2019

Alfa1 said:
As long as the IDs are there, titles are not needed.

Depends on the bug. Excessive sanitisation is unnecessary and makes debugging harder.

Alfa1 said:
This report often contains personal information such as IP address, email, and username. It also contains sensitive information such as server path, domain name, database, and IP.

You could strip personal information (IP and email generally qualify). Server path, domain name and database (name) aren't really sensitive, especially if logs are given to the developer privately. It should be a separate "strip personal data" button, if anything, though.

Alfa1 said:
As a result, the XenForo bug forum and the bug forums of developers contain a lot of private/personal information, which is not a good idea considering the GDPR, and sensitive information, which is not a good idea security-wise.

Not all developers use public forums for reporting bugs. In private areas, sometimes some forms of personal information (at least with how you classify things into that term) will be necessary for debugging. I've encountered some cases where users excessively censored information and it made it very difficult to debug the issue - forum admins are generally not great at deciding exactly what information to censor and what is relevant, and in cases of large datasets sometimes entire columns are filtered.

Nevertheless, users entrust the forum admin with their data. I'd be for any tools that make it easier for forum admins to censor information they feel shouldn't be shared. But if these are implemented, I would hope it's limited to actual personal information only, otherwise in some (many?) cases it's going to become painful for developers to get the data they need the first time.

If there is anything more than basic personal information (like emails and IPs) in your error log, you should really be sending it to the developer privately.

Kirby said:
As said before, this is simply not really possible.

Take the following example:

This clearly shows my username (Kirby), which you want to have removed.

It's easy to do that for "Generated by", but it's not possible for the stack trace - are those occurrences of "kirby" my username?
Or just some data that coincidentally contains the same text?

If you completely remove/censor "kirby" in that error, it becomes a lot less useful - maybe even to the extent that it is unusable.

The username isn't really sensitive imo, so that doesn't really matter. I would aim it more at emails and IPs.

You can definitely add a programmatic way to flag certain fields as protected so they are not displayed in logs - for example, passwords aren't dumped in error logs iirc. Sentry's APIs for Laravel and Rails, for example, do this as well - fields can definitely be hidden from logs. You can also use a cheap regex, but I'd be against that.
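
A rough sketch of what flagging fields could look like, in Python for brevity. This is not Sentry's or XenForo's API; the field names, the `redact` helper and the sample request state are all hypothetical:

```python
from typing import Any

# Hypothetical set of field names the application has flagged as protected.
PROTECTED_FIELDS = {"password", "email", "ip"}


def redact(value: Any) -> Any:
    """Return a copy of value with flagged fields masked, recursing into containers."""
    if isinstance(value, dict):
        return {
            key: "[filtered]" if str(key).lower() in PROTECTED_FIELDS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


# Made-up request state as it might be attached to an error log entry.
request_state = {
    "user_id": 1,
    "email": "someone@example.com",
    "ip": "203.0.113.7",
    "input": {"password": "hunter2", "message": "Mail me at someone@example.com"},
}
safe_state = redact(request_state)
```

Only values stored under a flagged field name are masked. The address inside the free-text `message` is left alone, because this approach looks at field names and not at values.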

Also, in your example, it is very niche for the user generating the error to also be the developer of an add-on in the stack trace. It's even rarer to be using your full username in the add-on (usually it's more of an abbreviation of some sort). So even if direct find+replace was used, which isn't a great idea, it would only be problematic in very few cases. It is probably more common in the XenForo Bug Reports forum, for bugs reported by developers, than it would be for forum admins reporting bugs to developers.

Alpha1 · May 31, 2019

Robust said:
Excessive sanitisation is unnecessary and makes debugging harder.

I have posted 1000+ bug reports and have never had a comment on the removal of this data.

Kirby said:
It's easy to do that for "Generated by", but it's not possible for the stack trace - are those occurrences of "kirby" my username?

For the username, I'm only referring to 'Generated by'.

Alpha1 · May 31, 2019

Robust said:
The username isn't really sensitive imo, so that doesn't really matter. I would aim it more at emails and IPs.

If the username is the person's real name or is identifiable, then it's an issue.

Robust · May 31, 2019

Alfa1 said:
I have posted thousands of bug reports and have never had a comment on the removal of this data.

It would depend on the bug, of course. If the bug doesn't relate to those specific fields at all then, sure, you can probably remove it.

I've had quite a few bug reports. The majority could censor some information that's completely irrelevant for the bug, but as mentioned above, often forum admins aren't the best judges of what information is or isn't relevant in a particular case. Some might be pretty good at identifying bugs and whatnot and being able to dissect what is relevant, others cannot.

I've had a few cases where excessive removal of data in logs made it difficult or impossible to debug the issue, and further detail was required.

Kirby · May 31, 2019

Robust said:
The username isn't really sensitive imo, so that doesn't really matter. I would aim it more at emails and IPs.

Sure. But unless you know exactly from the stack trace that it is actually an email or IP address, you could only use some heuristic to guess what could be an email or IP address.
Think of a post message discussing SNMP OIDs: it's quite likely that it contains something that looks like an IP address while in fact it is not.

You can definitely add a programmatic way to flag certain fields as protected so they are not displayed in logs - for example, passwords aren't dumped in error logs iirc. Sentry's APIs for Laravel and Rails, for example, do this as well - fields can definitely be hidden from logs. You can also use a cheap regex, but I'd be against that.

That's already being done for some cases like password parameters. As said before, to be sure you need to know exactly what you are dealing with from the stack trace - if you do not (a post message could contain anything), you could only use a regex, which might or might not catch what you are looking for.
So in a nutshell: It is not possible to do this in a watertight way.
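
The SNMP OID case is easy to reproduce. A minimal Python sketch, assuming a deliberately simple IPv4-looking pattern and a made-up post message (neither comes from XenForo):

```python
import re

# Deliberately simple IPv4-looking pattern: four dot-separated groups of 1-3 digits.
IPV4_LIKE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

# Made-up post message containing an SNMP OID and a real (documentation range) address.
message = "The sysDescr OID is 1.3.6.1.2.1.1.1.0, polled from 192.0.2.10."

# Mask everything the heuristic considers an IP address.
masked = IPV4_LIKE.sub("[ip]", message)
```

The real address is masked, but so are two chunks of the OID, which leaves the message useless for anyone trying to understand what the post was about. A stricter pattern can reduce false positives like this, but it cannot know what the text actually means.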

Robust · May 31, 2019

Alfa1 said:
If the username is the person's real name or is identifiable, then it's an issue.

I guess the user ID is also personal data then, so that should also be stripped. If it isn't stripped, then the developer could just go to your site and look up the user ID to get the username. Any thread titles, or IDs, or any sort of content should be stripped by this logic, really, because any such content could trace back to your site and result in identification of the user.

Pretty quickly this turns into a lot of stripping of data. Even data in Sentry from public app usage isn't that stripped - heck, Facebook was logging passwords in its internal application logs.

As I say, in general any kind of content can probably be stripped, but in some cases it can't (depending on the add-on and the nature of the bug). I'd rather not have to deal with excessive content stripping which could make some errors and issues harder to interpret and debug.

Alpha1 · May 31, 2019

Robust said:
I guess the user ID is also personal data then.

I don't think a user ID is a personal identifier. For example: you cannot search Google or any other database for the user ID to find the person. It's only unique to the person.
