perl - How can I concatenate multiple XML files?
How can I concatenate multiple XML files from different directories into a single XML file using Perl?
I've had to make quite a lot of assumptions with this, but here's an answer:
#!/usr/bin/perl -w
use strict;
use XML::LibXML;

my $output_doc = XML::LibXML->load_xml( string => <<EOF );
<?xml version="1.0" ?>
<issu-meta xmlns="ver2">
<metadescription>
<num-objects xml:id='total'/>
</metadescription>
<compatibility>
<baseline> 6.2.1.2.43 </baseline>
</compatibility>
</issu-meta>
EOF

my $object_count = 0;

foreach (@ARGV) {
    my $input_doc = XML::LibXML->load_xml( location => $_ );
    foreach ($input_doc->findnodes('/*[local-name()="issu-meta"]/*[local-name()="basictype"]')) {  # find each object
        my $object = $output_doc->importNode($_, 1);           # import the object information into the output document
        $output_doc->documentElement->appendChild($object);    # append the new XML nodes to the output document root
        $object_count++;                                       # keep track of how many objects we've seen
    }
}

my $total = $output_doc->getElementById('total');                  # find the element which will contain the object count
$total->appendChild($output_doc->createTextNode($object_count));   # append the object count to that element
$total->removeAttribute('xml:id');                                 # remove the XML id, as it's not wanted in the output

print $output_doc->toString;                                       # output the final document
Firstly, the <comp> element seems to come from nowhere, so I've had to ignore that. I'm also assuming that the required output content before each of the <basictype> elements is going to be the same, except for the object count.
So I build an empty output document to start with, and then iterate over each filename provided on the command line. For each, I find each object and copy it to the output document. Once I've done all the input files, I insert the object count.
It's made more difficult by the use of xmlns on the files. This makes the XPath search expression more complicated than it needs to be. If possible, I'd be tempted to remove the xmlns attributes, and you'd be left with:
foreach ($input_doc->findnodes('/issu-meta/basictype')) {
which is a lot simpler.
So, when I run this:
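If removing xmlns isn't an option, another way to keep the XPath readable is to register a prefix for the namespace with XML::LibXML::XPathContext. This is only a sketch: the inline document stands in for one of the input files, and the prefix "v" is an arbitrary name chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical inline input standing in for one of the source files.
my $input_doc = XML::LibXML->load_xml( string => <<'EOF' );
<?xml version="1.0"?>
<issu-meta xmlns="ver2">
  <basictype><id>1</id><name>pointer</name></basictype>
  <basictype><id>4</id><name>int32_t</name></basictype>
</issu-meta>
EOF

# Bind the (arbitrary) prefix "v" to the namespace URI the files use.
my $xpc = XML::LibXML::XPathContext->new($input_doc);
$xpc->registerNs( v => 'ver2' );

# The search expression now reads almost like the namespace-free version.
my @objects = $xpc->findnodes('/v:issu-meta/v:basictype');
my @names   = map { $xpc->findvalue('v:name', $_) } @objects;

print scalar(@objects), " objects: @names\n";
```

The prefix only exists inside the XPath context, so the input files and the output document are left untouched.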
perl combine abc/a.xml xyz/b.xml
I get:
<?xml version="1.0"?>
<issu-meta xmlns="ver2">
<metadescription>
<num-objects>3</num-objects>
</metadescription>
<compatibility>
<baseline> 6.2.1.2.43 </baseline>
</compatibility>
<basictype>
<id> 1 </id>
<name> pointer </name>
<pointer/>
<size> 64 </size>
</basictype><basictype>
<id> 4 </id>
<name> int32_t </name>
<primitive/>
<size> 32 </size>
</basictype><basictype>
<id> 2 </id>
<name> int8_t </name>
<primitive/>
<size> 8 </size>
</basictype></issu-meta>
which is pretty close to what you're after.
Edit: OK, my answer now looks like this:
#!/usr/bin/perl -w
use strict;
use XML::LibXML qw( :libxml );   # load LibXML support and include node type definitions

my $output_doc = XML::LibXML->load_xml( string => <<EOF );   # create an empty output document
<?xml version="1.0" ?>
<issu-meta xmlns="ver2">
<metadescription>
<num-objects xml:id='total'/>
</metadescription>
<compatibility>
<baseline> 6.2.1.2.43 </baseline>
</compatibility>
</issu-meta>
EOF

my $object_count = 0;

foreach (@ARGV) {
    my $input_doc = XML::LibXML->load_xml( location => $_ );
    my $import_started = 0;
    foreach ($input_doc->documentElement->childNodes) {
        next unless $_->nodeType == XML_ELEMENT_NODE;          # if it's not an element, ignore it
        if ($_->localname eq 'compatibility') {                # if it's the "compatibility" element, ...
            $import_started = 1;                               # ... switch on importing ...
            next;                                              # ... and move to the next child of the root
        }
        next unless $import_started;                           # if we've not started importing, and it's
                                                               # not the "compatibility" element, then
                                                               # ignore it and move on
        my $object = $output_doc->importNode($_, 1);           # import the object information into the output document
        $output_doc->documentElement->appendChild($object);    # append the new XML nodes to the output document root
        $object_count++;                                       # keep track of how many objects we've seen
    }
}

my $total = $output_doc->getElementById('total');                  # find the element which will contain the object count
$total->appendChild($output_doc->createTextNode($object_count));   # append the object count to that element
$total->removeAttribute('xml:id');                                 # remove the XML id, as it's not wanted in the output

print $output_doc->toString;                                       # output the final document
which imports each element that is a child of the root <issu-meta> document element after the first <compatibility> element it finds and, as before, updates the object count. If I've understood your requirement, that should do it for you.
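For comparison, the same selection ("every element after the first <compatibility> child of the root") can probably also be expressed as a single XPath using the following-sibling axis. This is a sketch rather than a drop-in replacement: the inline document is a made-up stand-in for an input file, and the prefix "v" is again an arbitrary choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical inline input standing in for one of the source files.
my $input_doc = XML::LibXML->load_xml( string => <<'EOF' );
<?xml version="1.0"?>
<issu-meta xmlns="ver2">
  <metadescription><num-objects>2</num-objects></metadescription>
  <compatibility><baseline>6.2.1.2.43</baseline></compatibility>
  <basictype><id>1</id></basictype>
  <enum><id>1835009</id></enum>
</issu-meta>
EOF

my $xpc = XML::LibXML::XPathContext->new($input_doc);
$xpc->registerNs( v => 'ver2' );   # "v" is an arbitrary prefix

# Select every element sibling that follows the first <compatibility>
# child of the root; text and comment nodes are skipped by "*".
my @to_import = $xpc->findnodes(
    '/v:issu-meta/v:compatibility[1]/following-sibling::*'
);

my @local_names = map { $_->localname } @to_import;
print "would import: @local_names\n";
```

The explicit loop in the answer is easier to step through while learning; the XPath form is just more compact once the idea is clear.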
If this works, I suggest you work through both this answer and my earlier one to ensure that you understand why it works for your problem. There are lots of useful technologies used in here, and once you understand it, you will have learned a lot of ways in which you can manipulate XML. If you have any problems, ask a new question on the site. Have fun!
Edit #2: Right, this should be the last piece you need:
#!/usr/bin/perl -w
use strict;
use XML::LibXML qw( :libxml );   # load LibXML support and include node type definitions

my @input_files = (
    'abc/a.xml',
    'xyz/b.xml',
);
my $output_file = 'output.xml';

my $output_doc = XML::LibXML->load_xml( string => <<EOF );   # create an empty output document
<?xml version="1.0" ?>
<issu-meta xmlns="ver2">
<metadescription>
<num-objects xml:id='total'/>
</metadescription>
<compatibility>
<baseline> 6.2.1.2.43 </baseline>
</compatibility>
</issu-meta>
EOF

my $object_count = 0;

foreach (@input_files) {
    my $input_doc = XML::LibXML->load_xml( location => $_ );
    my $import_started = 0;
    foreach ($input_doc->documentElement->childNodes) {
        next unless $_->nodeType == XML_ELEMENT_NODE;          # if it's not an element, ignore it
        if ($_->localname eq 'compatibility') {                # if it's the "compatibility" element, ...
            $import_started = 1;                               # ... switch on importing ...
            next;                                              # ... and move to the next child of the root
        }
        next unless $import_started;                           # if we've not started importing, and it's
                                                               # not the "compatibility" element, then
                                                               # ignore it and move on
        my $object = $output_doc->importNode($_, 1);           # import the object information into the output document
        $output_doc->documentElement->appendChild($object);    # append the new XML nodes to the output document root
        $object_count++;                                       # keep track of how many objects we've seen
    }
}

my $total = $output_doc->getElementById('total');                  # find the element which will contain the object count
$total->appendChild($output_doc->createTextNode($object_count));   # append the object count to that element
$total->removeAttribute('xml:id');                                 # remove the XML id, as it's not wanted in the output

$output_doc->toFile($output_file, 1);                              # output the final document
After running this:
perl combine
the file output.xml is created, with the following contents:
<?xml version="1.0"?>
<issu-meta xmlns="ver2">
<metadescription>
<num-objects>7</num-objects>
</metadescription>
<compatibility>
<baseline> 6.2.1.2.43 </baseline>
</compatibility>
<basictype>
<id> 1 </id>
<name> pointer </name>
<pointer/>
<size> 64 </size>
</basictype><basictype>
<id> 4 </id>
<name> int32_t </name>
<primitive/>
<size> 32 </size>
</basictype><enum>
<id>1835009 </id>
<name> chkpt_state_t </name>
<label>
<name> chkp_state_pending </name>
<value> 1 </value>
</label>
</enum><struct>
<id> 1835010 </id>
<name> _ipcendpoint </name>
<size> 64 </size>
<elem>
<id> 0 </id>
<name> ep_addr </name>
<type> uint32_t </type>
<type-id> 8 </type-id>
<size> 32 </size>
<offset> 0 </offset>
</elem>
</struct><basictype>
<id> 2 </id>
<name> int8_t </name>
<primitive/>
<size> 8 </size>
</basictype><alias>
<id> 1835012 </id>
<name> endpoint </name>
<size> 64 </size>
<type> _ipcendpoint </type>
<type-id> 1835010 </type-id>
</alias><bitmask>
<id> 1835015 </id>
<name> ipc_flag_t </name>
<size> 8 </size>
<type> uint8_t </type>
<type-id> 6 </type-id>
<label>
<name> ipc_application_register_msg </name>
<value> 1 </value>
</label>
</bitmask></issu-meta>
One last tip: although it makes no difference to your XML, it's a little more human-readable once it's been run through xmltidy:
<?xml version="1.0"?>
<issu-meta xmlns="ver2">
  <metadescription>
    <num-objects>7</num-objects>
  </metadescription>
  <compatibility>
    <baseline> 6.2.1.2.43 </baseline>
  </compatibility>
  <basictype>
    <id> 1 </id>
    <name> pointer </name>
    <pointer/>
    <size> 64 </size>
  </basictype>
  <basictype>
    <id> 4 </id>
    <name> int32_t </name>
    <primitive/>
    <size> 32 </size>
  </basictype>
  <enum>
    <id>1835009 </id>
    <name> chkpt_state_t </name>
    <label>
      <name> chkp_state_pending </name>
      <value> 1 </value>
    </label>
  </enum>
  <struct>
    <id> 1835010 </id>
    <name> _ipcendpoint </name>
    <size> 64 </size>
    <elem>
      <id> 0 </id>
      <name> ep_addr </name>
      <type> uint32_t </type>
      <type-id> 8 </type-id>
      <size> 32 </size>
      <offset> 0 </offset>
    </elem>
  </struct>
  <basictype>
    <id> 2 </id>
    <name> int8_t </name>
    <primitive/>
    <size> 8 </size>
  </basictype>
  <alias>
    <id> 1835012 </id>
    <name> endpoint </name>
    <size> 64 </size>
    <type> _ipcendpoint </type>
    <type-id> 1835010 </type-id>
  </alias>
  <bitmask>
    <id> 1835015 </id>
    <name> ipc_flag_t </name>
    <size> 8 </size>
    <type> uint8_t </type>
    <type-id> 6 </type-id>
    <label>
      <name> ipc_application_register_msg </name>
      <value> 1 </value>
    </label>
  </bitmask>
</issu-meta>
Good luck working through this and taking it further. Come back to the site to ask more questions when they come up!