peregm
Member
Hello there!
I've been developing some add-ons for my forum and I've been struggling when working with large quantities of data.
The thing is, I have a large dataset (say, over 13,000,000 entries) that I must process one entry at a time, doing some actions with each. For the sake of the example, let's assume we are talking about attachments.
So, when looping through all the attachments, what I'd normally do is fetch the entities in chunks (i.e. 500 items at a time, moving the offset forward each iteration), so I don't have to keep all the data in memory. This is what I currently have:
PHP:
protected function execute(InputInterface $input, OutputInterface $output)
{
    $offset = 0;
    $attachments = $this->getAttachments($offset);
    while ($attachments->count())
    {
        foreach ($attachments as $attachment)
        {
            // Do something
        }
        $attachments = $this->getAttachments(++$offset);
    }
    return 1;
}

private function getAttachments(int $offset): \XF\Mvc\Entity\ArrayCollection
{
    $limit = 500;
    return \XF::finder('XF:Attachment')
        ->limit($limit, $offset * $limit)
        ->fetch();
}
This runs smoothly until roughly the 65th loop iteration, when I get a memory exhausted error. My assumption is that some data is being kept in memory (somewhere in the fetch method).
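My current suspicion is XenForo's entity manager, which caches every entity it fetches. A minimal sketch of what I plan to try next, assuming the XF 2.x entity manager exposes a clearEntityCache() method (which I believe it does), would be:

PHP:
protected function execute(InputInterface $input, OutputInterface $output)
{
    $page = 0;
    $attachments = $this->getAttachments($page);
    while ($attachments->count())
    {
        foreach ($attachments as $attachment)
        {
            // Do something
        }

        // Sketch, not tested: drop the entities cached by the entity
        // manager during fetch() so PHP can actually free them.
        \XF::em()->clearEntityCache('XF:Attachment');

        $attachments = $this->getAttachments(++$page);
    }
    return 1;
}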
I've tried clearing the $attachments variable by setting it to null, and I've also tried closing the connection before fetching the next batch of items, but nothing seems to work.
I know I can raise my memory_limit, but I do not have unlimited memory, and the ideal solution would be to actually free the unnecessary data that is causing the leak.
What am I missing?
Thanks!