Description
Hello.
Thanks for creating this parser; it is fast and has nice documentation, really wonderful job. Unfortunately, I ran into issues with pausing the stream, which I need so I can write each record to a database before continuing.
I noticed incorrect behavior when calling stop/resume in the closing-tag handler, and I also got an out-of-memory crash on a big XML file, which unfortunately I cannot provide, as it contains corporate information.
To reproduce, use the scripts below: the first generates a large XML file, and the second demonstrates the error. Although there are 100000 records in sample.xml, the parser reports fewer closed tags (3274 in my case). If I remove the stop and resume calls, it reports 100000 as it should.
On another file (which, again, I cannot provide) it causes an out-of-memory crash. I will try to create an auto-generated file that reproduces this issue.
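For context, here is a minimal sketch of the pattern I am trying to use: stop the parser inside the closing-tag handler, persist the record asynchronously, then resume. The saveToDatabase function below is only a hypothetical placeholder for the real database write:
const expat = require('node-expat')
// Hypothetical stand-in for the real asynchronous database write
const saveToDatabase = (record) => new Promise((resolve) => setImmediate(resolve))
const parser = new expat.Parser('UTF-8')
let currentText = ''
parser.on('text', (text) => { currentText += text })
parser.on('endElement', (name) => {
  if (name === 'child') {
    const record = currentText
    currentText = ''
    parser.stop()                  // suspend parsing while the record is written
    saveToDatabase(record)
      .then(() => parser.resume()) // continue once the write has finished
      .catch((err) => parser.emit('error', err))
  }
})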
Sample generator:
const fs = require('fs')
const stream = require('stream')
const SAMPLE_RECORDS_COUNT = 100000
const SAMPLE_PATH = './sample.xml'
class DataStream extends stream.Readable {
  constructor() {
    super()
    this.count = 0
  }
  _read() {
    if (this.count === 0) {
      this.push('<?xml version="1.0" encoding="UTF-8"?><root>')
      this.count++
      return
    }
    this.count++
    if (this.count > SAMPLE_RECORDS_COUNT) {
      this.push('</root>')
      this.push(null)
    } else {
      this.push('<child>text</child>\n')
    }
  }
}
const ws = fs.createWriteStream(SAMPLE_PATH)
const ds = new DataStream()
ds.pipe(ws).on('close', () => {
  console.log('done')
  process.exit()
})

Reproduce:
const fs = require('fs')
const expat = require('node-expat')
console.log('started')
const parser = new expat.Parser('UTF-8')
const rs = fs.createReadStream('./sample.xml')
parser.on('startElement', (elt, attrs) => {
})
let count = 0
parser.on('endElement', async (elt, attrs) => {
  count++
  parser.stop()
  parser.resume()
})
parser.on('error', console.error)
rs.on('end', () => console.log(count))
rs.pipe(parser)