Add true picaxml handler #530#761
Conversation
Rename old PicaXmlHandler to PpXmlHandler. Then handler names follow the naming convention as stated here: https://format.gbv.de/pica
|
Unfortunatly the old pica handler had no tests and I added no new java tests, but I tested it with a local distribution and the following test workflows: Input Flux 1: Output: and: Flux 2: Output: |
dr0i
left a comment
There was a problem hiding this comment.
Missing tests in other modules should not be an excuse for dismissing tests in new modules. As more as I think in the code atm are bugs - which would reveal themselves if properly tested.
Ideall, the rename PpxXmlHandler would also get a test (but that would be nice to have and is up to you).
Also, we shouldn't forget to mark this as a breaking change!
| /* | ||
| * Copyright 2014 hbz, Pascal Christoph | ||
| * Copyright 2013, 2014 Deutsche Nationalbibliothek | ||
| * |
There was a problem hiding this comment.
Why not hbz? Didin't you/we at hbz implement it?
There was a problem hiding this comment.
Adjusted as suggested
There was a problem hiding this comment.
But you added M. Geipel as author. As he worked at the dnb we should than not delete dnb here, right?
There was a problem hiding this comment.
I am not sure. This is a reuse of the existing marcXmlHandler btw. this is also already heavily modified by hbz workers. I am not sure. Perhaps I should just refer to the other class? And delete Geipel?
| * | ||
| * @author Pascal Christoph (dr0i) | ||
| * A Pica xml reader. To read marc data without namespace specification set option `namespace=""` or to null when using JAVA code. | ||
| * @author Tobias Bülte |
There was a problem hiding this comment.
I think ,as not little of the code was just a reuse of the former code, you just add you name as author , not deleting the first one.
There was a problem hiding this comment.
You are still the creator of PpXmlHandler, PicaXmlHandler is a reuse of MarcXmlHandler which was created by Markus Michael Geipel
| * | ||
| */ | ||
| @Description("A pica xml reader") | ||
| @Description("A Pica XML reader. To read pica data without namespace specification set option `namespace=\"\"`. To ignore namespace specification set option `ignorenamespace=\"true\".") |
There was a problem hiding this comment.
Maybe here we should link to the new PpXmlHandler and hint when to use it?
| * A pica xml handler. | ||
| * | ||
| * @author Pascal Christoph (dr0i) | ||
| * A Pica xml reader. To read marc data without namespace specification set option `namespace=""` or to null when using JAVA code. |
There was a problem hiding this comment.
Better use capital letters "XML" like below.
| * @param namespace the namespace. Set to null if namespace shouldn't be checked. Set to empty string | ||
| * if the namespace is missing in the data. | ||
| */ | ||
| public void setNamespace(final String namespace) { |
There was a problem hiding this comment.
There is only one namespace valid (see below checkNamespacelogic). Everything not being tat namespace, i.e. also other namespaces, will be ignored. This should be documented.
There was a problem hiding this comment.
This is a reuse of the mechanism from MarcXmlHandler
There was a problem hiding this comment.
I reused the code and the documentation
| currentTag = attributes.getValue("tag"); | ||
| } | ||
| else if (RECORD.equals(localName) && checkNamespace(uri)) { | ||
| getReceiver().startRecord(""); |
There was a problem hiding this comment.
Not tested - but what if the the namespace ist set to be ignored? It won't work then, won't start the record, no?
|
I've not yet reviewed the code thoroughly because I think you first have to provide some tests so we can assure this code is working as proposed. |
974a033 to
563f8b0
Compare
Also improve some documentation
|
I added tests and adjusted the code. Also the initial implementation ignored the occurence these are now added as part of the element label if they exist. |
| /* | ||
| * Copyright 2014 hbz, Pascal Christoph | ||
| * Copyright 2013, 2014 Deutsche Nationalbibliothek | ||
| * |
There was a problem hiding this comment.
But you added M. Geipel as author. As he worked at the dnb we should than not delete dnb here, right?
In order to encode pica data the handling of occurence must be adjusted and add an slash.
|
The latest commit daa9e5f conforms the handling of occurrence by Field |
Rename old PicaXmlHandler to PpXmlHandler. Then handler names follow the naming convention as stated here: https://format.gbv.de/pica
Resolves #530
This was a very quick addition since picaxml follows the convention of marcxml minus a record type and indicators.
The old picaxml opener just opened ppxml and not picaxml. I renamed the old PicaXmlHandler opener to PpXmlHandler.