How to parse HTML text that contains singleton tags?
I have HTML data stored as a char array:
<html>
<head>
</head>
<body>
<div class="header">
HEADER1
</div>
<div class="content">
<br>my data
</div>
</body>
</html>
I want to retrieve the data between the tags, so I tried something like this:
import javax.xml.parsers.DocumentBuilderFactory
% Build a DOM parser and feed it the char array through a StringReader
dbf = javax.xml.parsers.DocumentBuilderFactory.newInstance();
builder = dbf.newDocumentBuilder();
is = org.xml.sax.InputSource(java.io.StringReader(htmldata));
dom = builder.parse(is);
The above code works fine when there are no singleton tags, but it throws an error as soon as I add one:
[Fatal Error] :36:7: The element type "br" must be terminated by the matching end-tag "</br>".
Even xmlread throws the same error:
>> xmlread(is)
[Fatal Error] :The element type "br" must be terminated by the matching end-tag "</br>".
Error using xmlread (line 106)
Java exception occurred:
org.xml.sax.SAXParseException; The element type "br" must be terminated by the matching end-tag "</br>".
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
Is there any workaround for this?
1 Comment
Purav Panchal
31 Aug 2022
Hey, did you find any solution?
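One possible workaround, offered as a minimal sketch and not as a confirmed solution from this thread: the XML parser rejects `<br>` because XML requires every element to be closed, so the text can be pre-processed to self-close singleton tags before it reaches the parser. The list in `voidTags`, the variable names `pattern`, `xhtml`, and `texts`, and the one-line sample input are assumptions made for illustration; only `htmldata` and the Java parser calls come from the question.

```matlab
% Sample input with a singleton <br> tag (same shape as the data in the question).
htmldata = ['<html><head></head><body>' ...
            '<div class="header">HEADER1</div>' ...
            '<div class="content"><br>my data</div>' ...
            '</body></html>'];

% Assumed list of singleton (void) elements; extend it to match your data.
voidTags = 'area|base|br|col|embed|hr|img|input|link|meta|source|track|wbr';

% Rewrite <br>, <br />, <img src="x"> and similar as self-closed tags.
% The lookahead keeps 'col' from matching inside 'colgroup', and so on.
pattern = ['<(' voidTags ')(?=[\s/>])([^<>]*?)\s*/?>'];
xhtml = regexprep(htmldata, pattern, '<$1$2/>', 'ignorecase');

% Parse the rewritten text with the same Java DOM parser as in the question.
dbf = javax.xml.parsers.DocumentBuilderFactory.newInstance();
builder = dbf.newDocumentBuilder();
is = org.xml.sax.InputSource(java.io.StringReader(xhtml));
dom = builder.parse(is);

% Collect the text content of every <div> element.
divs = dom.getElementsByTagName('div');
texts = cell(1, divs.getLength());
for k = 1:divs.getLength()
    node = divs.item(k - 1);                 % Java NodeList is zero-based
    texts{k} = strtrim(char(node.getTextContent()));
end
disp(texts);
```

This only addresses unclosed singleton tags. Real-world HTML can still fail in an XML parser for other reasons (unquoted attribute values, entities such as `&nbsp;`, unclosed `<p>` elements), in which case a dedicated HTML parser is likely the safer route.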