|
OK, I'd be the first to post a link to Parsing Html The Cthulhu Way if anyone suggested Regular Expressions, but I have a problem using an XmlDocument (and therefore XPath) with an HTML file I'm downloading.
The page is a list of files to download -- I need to extract the href values from the a elements, and obviously I'd prefer to use XPath to do that.
0) The file doesn't contain an opening <HTML> tag (it does have a closing </HTML> tag ) -- I can tack one on, that's not a big deal.
1) It contains at least one entity (and possibly other entities) and the XmlDocument doesn't like that.
So I need options, people!
I can summon Cthulhu.
I can use Regular Expressions to replace any offending entities and then feed the result to an XmlDocument.
What other options might there be?
|
|
|
|
|
HTML != XML
Use the HTML Agility Pack instead.
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
Ah, sooooo... let the summoning begin!
Oh, mighty Cthulhu! Wise and terrible! I ask your assistance as my days have been blighted with some gnarly HTML! Please, oh lord, come smite the bare buttocks of the wretch who hast wrought this travesty. I will repay you with a pint of bitter. Not a measly USian pint mind you, but a proper British pint.
|
|
|
|
|
No need to make that call to R'lyeh yet; the HAP makes parsing an HTML document simple:
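// Requires the HtmlAgilityPack library: using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load(@"path\to\your\file.htm");
// SelectNodes returns null when nothing matches, so check before iterating.
HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        string url = link.GetAttributeValue("href", "");
        Fhtagn(url);
    }
}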
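For comparison outside .NET, the same idea (hand the page to a lenient HTML parser instead of an XML parser) can be sketched with Python's standard-library html.parser. This is only an illustration, not the HAP API: the class name HrefCollector and the sample markup are made up, and the sample assumes a page with a missing opening <HTML> tag and a couple of entities, as described in the question.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased; attribute values
        # arrive with character references already decoded.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# Hypothetical sample: no opening <html>, a stray closing </HTML>, entities.
SAMPLE = (
    '<body><a href="a.zip">A&nbsp;file</a>'
    '<A HREF="b.zip?x=1&amp;y=2">B</A></body></HTML>'
)

collector = HrefCollector()
collector.feed(SAMPLE)
collector.close()
hrefs = collector.hrefs
```

The parser does not require well-formed input, so neither the missing opening tag nor the entities need to be patched first.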
|
|
|
|
|
If I take data (1, 2, 3, a, b, c, ...) and I put it in XML form
(<a>, <b>, <c>), does that mean that the data becomes information because of XML? Does XML turn data into information by representing it in tags?
I am also reading here in a slideshow: "In JDOM, every XML tree is approached as a document even though the content has nothing to do with documents". I looked up the definition of 'document' on dictionary.com and it states that a document is meant to be informative. 'Informative' means 'conveying information'. Then, if the purpose of XML is to represent data as information, why does the content of an XML tree supposedly not have anything to do with a document and therefore nothing to do with information? This is confusing. Perhaps the author of that slideshow was using different semantics than I have in my mind right now.
Any thoughts on this?
|
|
|
|
|
I actually view XML documents as a datatable, and in fact you can store a datatable in XML format (in .NET, DataTable has a WriteXml method for this).
|
|
|
|
|
We usually run into various file formats such as XLS, TXT and others when preparing a report. Are there any easy ways or tools to make the format conversion more direct?
|
|
|
|
|
XML is the data container; XSLT is the document that transforms the XML into some other data container -- its purpose is to do what you are asking. XSD is the schema -- it defines the format of the XML so that consumers can correctly instantiate objects that are contained in the XML document.
People do not seem to understand the difference between XML, XSD and XSLT, or the purpose of each -- and that is somewhat tragic, because together they are the best way to do things and to maintain the validity of data across disparate systems.
|
|
|
|
|
Please advise how I can run XML using Eclipse.
I am new to this field.
|
|
|
|
|
|
Hi Guys,
I have a WiX installer. I need to end a running task before starting the upgrade. I tried a custom action that runs a batch script and added it to the sequence at "InstallInitialize", but the custom action didn't launch before the upgrade started, so it didn't work. Does anybody have any idea how to implement this?
Thank you very much for your help.
-Prasanth
|
|
|
|
|
Hi! I have a scenario. A company has asked me to include a payment system in my application, and these are the requirements:
a) we need you to provide us a location (in the form of a URL) which we can use to hit your server
b) we will send information regarding a payment transaction in the following XML format to your server at the URL address:
Request sent from ABC Company to My Company
<?xml version="1.0"?>
<!DOCTYPE COMMAND PUBLIC "-//Ocam//DTD XML Command 1.0//EN" "xml/command.dtd">
<COMMAND>
<TYPE>PAYMENT</TYPE>
<COMPANYNAME>Luku</COMPANYNAME>
<CUSTOMEREFID>refernceNumber</CUSTOMEREFID>
<MSISDN>msisdn</MSISDN>
<AMOUNT></AMOUNT>
<TXNID></TXNID>
<STATUS></STATUS>
</COMMAND>
c) on your side, we need you to interpret the information provided in the XML string, take whatever actions need to be taken on your platform, and then give us a response with the results of your actions in the following format:
Asynchronous response sent from My Company to ABC
<?xml version="1.0"?>
<!DOCTYPE COMMAND PUBLIC "-//Ocam//DTD XML Command 1.0//EN" "xml/command.dtd">
<COMMAND>
<TYPE>GBPHANDLER</TYPE>
<TXNID>BP120522.1117.C00001</TXNID>
<REFID>1357</REFID>
<RESULT>TS</RESULT>
<ERRORCODE>ERR123</ERRORCODE>
<FLAG>Y</FLAG>
<CONTENT>Content For Sending SMS</CONTENT>
<MSISDN>09899847486</MSISDN>
<COMPANYCODE>Luku</COMPANYCODE>
</COMMAND>
I have tried to resolve this. This is the XML file I have created:
<?xml version="1.0"?>
<!DOCTYPE COMMAND PUBLIC "-//Ocam//DTD XML Command 1.0//EN" "xml/command.dtd">
<COMMAND>
<TYPE>GBPHANDLER</TYPE>
<TXNID>BP120522.1117.C00001</TXNID>
<REFID>1357</REFID>
<RESULT>TS</RESULT>
<ERRORCODE>ERR123</ERRORCODE>
<FLAG>Y</FLAG>
<CONTENT>Content For Sending SMS</CONTENT>
<MSISDN>09899847486</MSISDN>
<COMPANYCODE>Luku</COMPANYCODE>
</COMMAND>
and this is my .dtd file :
<?xml version="1.0"?>
<!DOCTYPE COMMAND PUBLIC "-//Ocam//DTD XML Command 1.0//EN" "xml/command.dtd">
<COMMAND>
<TYPE>GBPHANDLER</TYPE>
<TXNID>BP120522.1117.C00001</TXNID>
<REFID>1357</REFID>
<RESULT>TS</RESULT>
<ERRORCODE>ERR123</ERRORCODE>
<FLAG>Y</FLAG>
<CONTENT>Content For Sending SMS</CONTENT>
<MSISDN>09899847486</MSISDN>
<COMPANYCODE>Luku</COMPANYCODE>
</COMMAND>
Now I am stuck at that level. My questions are: Which URL do I need to provide? How do I generate the XML file? Do I need to store this information in a database? The system needs to verify an incoming request every time a customer makes a payment, by verifying a reference code generated by the system and checking that it matches the amount of the product he wants to buy. Please, I need an idea. I am spending a lot of time learning XML but I am not finding the answer. By the way, I am using C#.NET. Thanks!
|
|
|
|
|
I would suppose the URL you need to give is the address of your HANDLER for the client company's request (the page you write to process the request).
So if your handler was called TransactionValidator.php then the URL would be....
http://www.yourcompany.net/some-folder/TransactionValidator.php
I thought a .dtd was a document type definition, in which case it should describe the data format of your XML file rather than repeat its content (I recommend you check this, as I may be wrong).
Yes, you need to persist the request (and the results) in a database for future reference in case things go pear-shaped!!
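What such a handler does with the request body can be sketched roughly as follows. This is a minimal illustration in Python rather than C#, using only the standard library. The function names, the sample values, and the mapping of CUSTOMEREFID to REFID and COMPANYNAME to COMPANYCODE are assumptions made for the example, not part of ABC Company's specification, and the actual business check is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical request body; the values are invented for illustration.
REQUEST = """<?xml version="1.0"?>
<!DOCTYPE COMMAND PUBLIC "-//Ocam//DTD XML Command 1.0//EN" "xml/command.dtd">
<COMMAND>
  <TYPE>PAYMENT</TYPE>
  <COMPANYNAME>Luku</COMPANYNAME>
  <CUSTOMEREFID>1357</CUSTOMEREFID>
  <MSISDN>09899847486</MSISDN>
  <AMOUNT>500</AMOUNT>
  <TXNID>BP120522.1117.C00001</TXNID>
  <STATUS></STATUS>
</COMMAND>"""


def parse_payment(xml_text):
    """Return the child elements of <COMMAND> as a tag -> text dict."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def build_response(fields, result):
    """Build a response <COMMAND>; the field mapping is an assumption."""
    root = ET.Element("COMMAND")
    for tag, value in (
        ("TYPE", "GBPHANDLER"),
        ("TXNID", fields["TXNID"]),
        ("REFID", fields["CUSTOMEREFID"]),
        ("RESULT", result),
        ("MSISDN", fields["MSISDN"]),
        ("COMPANYCODE", fields["COMPANYNAME"]),
    ):
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


fields = parse_payment(REQUEST)
response = build_response(fields, "TS")
```

In a real handler, the lookup of the reference number and amount in the database, and the storing of the request, would sit between parsing and building the response.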
|
|
|
|
|
|
Hi! I am putting this code in an XML file:
<COMMAND>
<TYPE>GBPHANDLER</TYPE>
<TXNID>BP120522.1117.C00001</TXNID>
<REFID>1357</REFID>
<RESULT>TS</RESULT>
<ERRORCODE>ERR123</ERRORCODE>
<FLAG>Y</FLAG>
<CONTENT>Content For Sending SMS</CONTENT>
<MSISDN>09899847486</MSISDN>
<COMPANYCODE>Luku</COMPANYCODE>
</COMMAND>
but I am getting this error in the code:
Could not find a part of the path 'C:\MyProject\xml\command.dtd'.
Do I need to create a file with a .dtd extension, or what? Please can someone help me resolve this.
|
|
|
|
|
You have told the XML reader that the schema for your XML is held in the file xml/command.dtd, so you need to provide that DTD file as well.
One of these days I'm going to think of a really clever signature.
|
|
|
|
|
Ohh, thanks. I didn't know that it is necessary to create a separate file. It is my first time working with DTD and XML files, so...
|
|
|
|
|
In that case I would suggest you spend some time learning the structure of XML and DTD files before proceeding further.
|
|
|
|
|
Hello. I'm new to XML and I have a big problem. How do I generate a sample XML document from this schema?
<xs:schema targetNamespace="http://www.csi.it/sigit/sigitweb/xml/import" elementFormDefault="qualified" attributeFormDefault="unqualified">
<!-- ELENCO IMPIANTI -->
<xs:element name="ElencoImpianti"></xs:element>
<!-- ******************************* -->
<!-- DEFINIZIONE SEZIONI -->
<!-- ******************************* -->
<!-- IMPIANTO -->
<xs:element name="Impianto"></xs:element>
<!-- ******************************* -->
<!-- DEFINIZIONE TIPI COMPLESSI -->
<!-- ******************************* -->
<!-- MANUTENTORE -->
<xs:element name="Manutentore"></xs:element>
<!-- IDENTIFICAZIONE IMPIANTO -->
<xs:element name="IdentificazioneImpianto"></xs:element>
<!-- ELENCO RESPONSABILI -->
<xs:complexType name="ElencoResponsabili"></xs:complexType>
<!-- RESPONSABILE -->
<xs:element name="Responsabile"></xs:element>
<!-- ELENCO EDIFICI -->
<xs:complexType name="ElencoAltriEdifici"></xs:complexType>
<!-- ALTRI EDIFICI -->
<xs:element name="AltriEdifici"></xs:element>
<!-- ELENCO APPARECCHIATURE -->
<xs:complexType name="ElencoApparecchiature"></xs:complexType>
<!-- APPARECCHIATURA -->
<xs:element name="Apparecchiatura"></xs:element>
<!-- ELENCO ALLEGATOG -->
<xs:complexType name="ElencoAllegatoG"></xs:complexType>
<!-- ALLEGATOG -->
<xs:element name="AllegatoG"></xs:element>
<!-- ELENCO ALLEGATOF -->
<xs:complexType name="ElencoAllegatoF"></xs:complexType>
<!-- ALLEGATOF -->
<xs:element name="AllegatoF"></xs:element>
<xs:element name="ElencoPagine"></xs:element>
<!-- PAGINA ALLEGATOF -->
<xs:element name="Pagina"></xs:element>
</xs:schema>
I browsed some tips and tricks and read some tutorials, but I don't get how an XML file can work without a root element while the schema has several global elements.
Thanks in advance!
|
|
|
|
|
I tried to import the schema into VS 2010 and Excel, and both say it is not a valid schema. So the answer is that you can't generate a sample document from this schema as posted; you should go to the source and get a corrected one.
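One concrete problem that is visible in the snippet as posted is that the xs prefix is never bound to the XML Schema namespace (this may simply have been lost when the schema was pasted into the forum). A minimal sketch in Python, using only the standard library and a shortened copy of the posted schema, shows the effect. The helper names are made up for the example.

```python
import xml.etree.ElementTree as ET

XS_NS = "http://www.w3.org/2001/XMLSchema"

# Shortened copy of the posted schema: the xs prefix is never declared.
AS_POSTED = (
    '<xs:schema targetNamespace="http://www.csi.it/sigit/sigitweb/xml/import">'
    '<xs:element name="ElencoImpianti"></xs:element>'
    '</xs:schema>'
)

# Same text with the namespace declaration added (an assumed repair).
WITH_NAMESPACE = AS_POSTED.replace(
    "<xs:schema ", '<xs:schema xmlns:xs="%s" ' % XS_NS, 1
)


def parse_error(xml_text):
    """Return the parser's error message, or None if the text parses."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


def global_elements(xml_text):
    """List the names of the top-level xs:element declarations."""
    root = ET.fromstring(xml_text)
    return [el.get("name") for el in root.findall("{%s}element" % XS_NS)]


error_as_posted = parse_error(AS_POSTED)
error_repaired = parse_error(WITH_NAMESPACE)
names = global_elements(WITH_NAMESPACE)
```

This only checks that the text is namespace-well-formed XML, not that it is a complete schema: the element declarations as posted are still empty. On the original question, an XML Schema may declare several global elements, and any one of them can serve as the root of an instance document.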
|
|
|
|
|
If there is no root, you do not have a valid document. I think you are confusing the XML declaration with the root -- the root is the first element that is NOT the XML declaration. What you have posted is an XML schema fragment.
|
|
|
|
|
Consider the following XML file:
<handbook title="HangGlider">
<section title="Red Jewel 1:1" />
<section title="Red Jewel 1:2" />
<section title="Red Jewel 1:3" />
<section title="Red Jewel 1:4" />
<section title="Green Jewel 1:1" />
<section title="Green Jewel 1:2" />
<section title="Green Jewel 1:3" />
<section title="Green Jewel 1:4" />
</handbook>
<handbook title="WingRunner">
<section title="Red Jewel 1:1" />
<section title="Red Jewel 1:2" />
<section title="Red Jewel 1:3" />
<section title="Red Jewel 1:4" />
<section title="Green Jewel 1:1" />
<section title="Green Jewel 1:2" />
<section title="Green Jewel 1:3" />
<section title="Green Jewel 1:4" />
</handbook>
<handbook title="SkyStormer">
<section title="Red Jewel 1:1" />
<section title="Red Jewel 1:2" />
<section title="Red Jewel 1:3" />
<section title="Red Jewel 1:4" />
<section title="Green Jewel 1:1" />
<section title="Green Jewel 1:2" />
<section title="Green Jewel 1:3" />
<section title="Green Jewel 1:4" />
</handbook>
I would like to lay the file out so that the repeated data only exists once, but can be shared by multiple handbooks. Is that possible?
"One man's wage rise is another man's price increase." - Harold Wilson
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons
"Show me a community that obeys the Ten Commandments and I'll show you a less crowded prison system." - Anonymous
|
|
|
|
|
Do you mean like this?
<handbooks>
<CommonSections>
<section title="Red Jewel 1:1" InBooks="1,2,3"/>
<section title="Red Jewel 1:2" InBooks="1,2,3"/>
<section title="Red Jewel 1:3" InBooks="1,2,3"/>
<section title="Red Jewel 1:4" InBooks="1,2,3"/>
<section title="Green Jewel 1:1" InBooks="1,2,3"/>
<section title="Green Jewel 1:2" InBooks="1,2,3"/>
<section title="Green Jewel 1:3" InBooks="1,2,3"/>
<section title="Green Jewel 1:4" InBooks="1,2,3"/>
</CommonSections>
<handbook id="1" title="HangGlider">
</handbook>
<handbook id="2" title="WingRunner">
</handbook>
<handbook id="3" title="SkyStormer">
</handbook>
</handbooks>
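A consumer of this layout has to join the shared sections back to each handbook through the InBooks list. A minimal sketch of that lookup, in Python with the standard library only, could look like this. The function name is made up, and the sample is shortened, with one section deliberately limited to books 1 and 3 to show the filtering.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the layout above; the second InBooks value is varied
# on purpose for illustration.
DOC = """<handbooks>
  <CommonSections>
    <section title="Red Jewel 1:1" InBooks="1,2,3"/>
    <section title="Green Jewel 1:1" InBooks="1,3"/>
  </CommonSections>
  <handbook id="1" title="HangGlider"/>
  <handbook id="2" title="WingRunner"/>
  <handbook id="3" title="SkyStormer"/>
</handbooks>"""


def sections_by_handbook(xml_text):
    """Map each handbook title to the titles of the sections it shares."""
    root = ET.fromstring(xml_text)
    shared = [
        (s.get("title"), set(s.get("InBooks", "").split(",")))
        for s in root.iter("section")
    ]
    return {
        hb.get("title"): [title for title, ids in shared if hb.get("id") in ids]
        for hb in root.iter("handbook")
    }


result = sections_by_handbook(DOC)
```

The trade-off of this design is that the file itself no longer states which sections a handbook has; every reader must perform this join.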
|
|
|
|
|
I am a total rookie so forgive my ignorance in advance.
I am trying to use XSLT to transform XML output from my Filemaker database so that I can re-import it (to synchronize the data after it has been copy-edited externally).
The whole business works fine with one exception. In the XML file, some text has visual tags such as <hi rend="italic"></hi>.
The XSLT does not bring this tagging into the transformed file. I have tried using
<xsl:copy-of select="note"/> and
<xsl:value-of select="note"/>
Either one of these brings the text across perfectly, but not the <hi> tags.
Ideas?
I'm using oXygen 12.2 and FMProAdvanced 11.
thanks!
|
|
|
|
|
Turns out "copy-of" was working; it was FMPro that was stripping the angle-bracket tags out of the file on import. BAD Filemaker!
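For anyone hitting the same symptom: xsl:value-of outputs only the string value of a node (its text with the markup removed), while xsl:copy-of copies the whole subtree, child elements included. The difference can be illustrated with a rough analogy in Python's standard library. This is not XSLT, and the sample note is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical note with inline markup, similar to the <hi> tags described.
note = ET.fromstring(
    '<note>See <hi rend="italic">Moby-Dick</hi>, ch. 1.</note>'
)

# Comparable to xsl:value-of: concatenated text, markup dropped.
string_value = "".join(note.itertext())

# Comparable to xsl:copy-of: the element serialized with its children.
deep_copy = ET.tostring(note, encoding="unicode")
```

If the tags are present in the transformed file but missing after import, the stylesheet is doing its job and the loss happens in the importing application, as it did here.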
|
|
|
|