Vendor import of expat 2.7.5

Philip Paeps
2026-04-01 16:49:18 +08:00
parent a8fa7ccb47
commit f5b5e29279
33 changed files with 3817 additions and 2334 deletions
+141 -18
@@ -10,37 +10,160 @@
!! ~~~~~~~~~~~~ !!
!! The following topics need *additional skilled C developers* to progress !!
!! in a timely manner or at all (loosely ordered by descending priority): !!
!! _______________________ !!
!! - teaming up on fixing the UNFIXED SECURITY ISSUES listed at: !!
!! """"""""""""""""""""""" !!
!! https://github.com/libexpat/libexpat/issues/1160 !!
!! !!
!! - teaming up on researching and fixing future security reports and !!
!! ClusterFuzz findings with few-days-max response times in communication !!
!! in order to (1) have a sound fix ready before the end of a 90 days !!
!! grace period and (2) in a sustainable manner, !!
!! - helping CPython Expat bindings with supporting Expat's amplification !!
!! attack protection API (https://github.com/python/cpython/issues/90949): !!
!! - XML_SetAllocTrackerActivationThreshold !!
!! - XML_SetAllocTrackerMaximumAmplification !!
!! - XML_SetBillionLaughsAttackProtectionActivationThreshold !!
!! - XML_SetBillionLaughsAttackProtectionMaximumAmplification !!
!! - helping Perl's XML::Parser Expat bindings with supporting Expat's !!
!! security API (https://github.com/cpan-authors/XML-Parser/issues/102): !!
!! - XML_SetAllocTrackerActivationThreshold !!
!! - XML_SetAllocTrackerMaximumAmplification !!
!! - XML_SetBillionLaughsAttackProtectionActivationThreshold !!
!! - XML_SetBillionLaughsAttackProtectionMaximumAmplification !!
!! - XML_SetReparseDeferralEnabled !!
!! !!
!! - implementing and auto-testing XML 1.0r5 support !!
!! (needs discussion before pull requests), !!
!! - smart ideas on fixing the Autotools CMake files generation issue !!
!! without breaking CI (needs discussion before pull requests), !!
!! - pushing migration from `int` to `size_t` further !!
!! including edge-cases test coverage (needs discussion before anything). !!
!! !!
!! For details, please reach out via e-mail to sebastian@pipping.org so we !!
!! can schedule a voice call on the topic, in English or German. !!
!! !!
!! THANK YOU! Sebastian Pipping -- Berlin, 2024-03-09 !!
!! THANK YOU! Sebastian Pipping -- Berlin, 2026-03-17 !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Release 2.7.5 Tue March 17 2026
Security fixes:
#1158 CVE-2026-32776 -- Fix NULL function pointer dereference for
empty external parameter entities; it takes use of both
functions XML_ExternalEntityParserCreate and
XML_SetParamEntityParsing for an application to be
vulnerable.
#1161 #1162 CVE-2026-32777 -- Protect from XML_TOK_INSTANCE_START
infinite loop in function entityValueProcessor; it takes
use of both functions XML_ExternalEntityParserCreate and
XML_SetParamEntityParsing for an application to be
vulnerable.
#1163 CVE-2026-32778 -- Fix NULL dereference in function setContext
on retry after an earlier out-of-memory condition; it takes
use of function XML_ParserCreateNS or XML_ParserCreate_MM
for an application to be vulnerable.
#1160 Three more unfixed vulnerabilities left
Other changes:
#1146 #1147 Autotools: Fix condition for symbol versioning check, in
particular when compiling with slibtool (not libtool)
#1156 Address Cppcheck >=2.20.0 warnings
#1153 tests: Make test_buffer_can_grow_to_max work for MinGW on
Ubuntu 24.04
#1157 #1159 Version info bumped from 12:2:11 (libexpat*.so.1.11.2)
to 12:3:11 (libexpat*.so.1.11.3); see https://verbump.de/
for what these numbers do
Infrastructure:
#1148 CI: Fix FreeBSD and Solaris CI
#1149 CI: Bump to WASI SDK 30
#1153 CI: Adapt to breaking changes with Ubuntu 22.04
#1156 CI: Adapt to breaking changes in Cppcheck
Special thanks to:
Berkay Eren Ürün
Christian Ng
Fabio Scaccabarozzi
Francesco Bertolaccini
Mark Brand
Rhodri James
and
AddressSanitizer
Buttercup
OSS-Fuzz / ClusterFuzz
Trail of Bits
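The version info bump listed for release 2.7.5 (12:2:11 to 12:3:11) uses libtool's current:revision:age triple. As a rough sketch of how that triple maps to the shared object suffix on Linux-style platforms, the helper below reproduces the two mappings quoted in the entry. The function name `libtool_so_version` is hypothetical and not part of Expat or libtool; other platforms may derive file names differently.

```python
def libtool_so_version(version_info: str) -> str:
    """Map a libtool 'current:revision:age' triple to a Linux-style .so suffix.

    Assumption: the common Linux convention, where the suffix is
    (current - age).age.revision.
    """
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not exceed current")
    return f"{current - age}.{age}.{revision}"


if __name__ == "__main__":
    # The two triples named in the changelog entry above.
    for triple in ("12:2:11", "12:3:11"):
        print(f"{triple} -> libexpat.so.{libtool_so_version(triple)}")
```

Only the revision field changes between the two releases, which is why the last component of the file name moves from 2 to 3 while the soname major version stays at 1.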
Release 2.7.4 Sat January 31 2026
Security fixes:
#1131 CVE-2026-24515 -- Function XML_ExternalEntityParserCreate
failed to copy the encoding handler data passed to
XML_SetUnknownEncodingHandler from the parent to the new
subparser. This can cause a NULL dereference (CWE-476) from
external entities that declare use of an unknown encoding.
The expected impact is denial of service. It takes use of
both functions XML_ExternalEntityParserCreate and
XML_SetUnknownEncodingHandler for an application to be
vulnerable.
#1075 CVE-2026-25210 -- Add missing check for integer overflow
related to buffer size determination in function doContent
Bug fixes:
#1073 lib: Fix missing undoing of group size expansion in doProlog
failure cases
#1107 xmlwf: Fix a memory leak
#1104 WASI: Fix format specifiers for 32bit WASI SDK
Other changes:
#1105 lib: Fix strict aliasing
#1106 lib: Leverage feature "flexible array member" of C99
#1051 lib: Swap (size_t)(-1) for C99 equivalent SIZE_MAX
#1109 lib|xmlwf: Return NULL instead of 0 for pointers
#1068 lib|Windows: Clean up use of macro _MSC_EXTENSIONS with MSVC
#1112 lib: Remove unused import
#1110 xmlwf: Warn about XXE in --help output (and man page)
#1102 #1103 WASI: Stop using getpid
#1113 #1130 Autotools: Drop file expat.m4 that provided obsolete Autoconf
macro AM_WITH_EXPAT
#1123 Autotools: Limit -Wno-pedantic-ms-format to MinGW
#1129 #1134 ..
#1087 Autotools|macOS: Sync CMake templates with CMake 4.0
#1139 #1140 Autotools|CMake: Introduce off-by-default symbol versioning
The related build system flags are:
- For Autotools, configure with --enable-symbol-versioning
- For CMake, configure with -DEXPAT_SYMBOL_VERSIONING=ON
Please double-check for consequences before activating
this inside distro packaging. Bug reports welcome!
#1117 Autotools|CMake: Remove libbsd support
#1105 Autotools|CMake: Stop using -fno-strict-aliasing, and use
-Wstrict-aliasing=3 instead
#1124 Autotools|CMake: Prefer command gsed (GNU sed) over sed
(e.g. for Solaris) inside fix-xmltest-log.sh
#1067 CMake: Detect and warn about unusable check_c_compiler_flag
#1137 CMake: Drop support for CMake <3.17
#1138 CMake|Windows: Fix libexpat.def.cmake version comments
#1086 #1110 docs: Add warning about external reference handlers and XXE
#1066 docs: Be explicit that parent parsers need to outlive
subparsers
#1089 ..
#1090 #1091 ..
#1092 #1093 ..
#1094 #1098 ..
#1115 #1116 docs: Misc non-content improvements to doc/reference.html
#1132 #1133 Version info bumped from 12:1:11 (libexpat*.so.1.11.1)
to 12:2:11 (libexpat*.so.1.11.2); see https://verbump.de/
for what these numbers do
Infrastructure:
#1119 #1121 Document guidelines for contributing to Expat
#1120 Introduce a pull request template
#1074 CI: Stop using about-to-be-removed image "macos-13"
#1083 #1088 CI: Mitigate random Wine crashes
#1104 CI: Cover compilation with WASI SDK
#1116 CI: Enforce clean doc XML formatting
#1124 ..
#1135 #1136 CI: Cover Solaris 11.4
#1125 CI: Extend CI coverage of FreeBSD
#1139 #1140 CI: Cover symbol versioning
#1114 xmlwf: Reformat helpgen code (using Black 25.12.0)
#1071 .gitignore: Add files CPackConfig.cmake and
CPackSourceConfig.cmake
Special thanks to:
Alfonso Gregory
Bénédikt Tran
Gordon Messmer
Hanno Böck
Jakub Kulík
Matthew Fernandez
Neil Pang
Rosen Penev
and
Artiphishell Inc.
Release 2.7.3 Wed September 24 2025
Security fixes:
#1046 #1048 Fix alignment of internal allocations for some non-amd64
+1 -2
@@ -6,7 +6,7 @@
# \___/_/\_\ .__/ \__,_|\__|
# |_| XML parser
#
# Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2017-2026 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2018 KangLin <kl222@126.com>
# Copyright (c) 2022 Johnny Jazeix <jazeix@gmail.com>
# Copyright (c) 2023 Sony Corporation / Snild Dolkow <snild@sony.com>
@@ -94,7 +94,6 @@ EXTRA_DIST = \
$(_EXTRA_DIST_CMAKE) \
$(_EXTRA_DIST_WINDOWS) \
\
conftools/expat.m4 \
conftools/get-version.sh \
\
fuzz/xml_lpm_fuzzer.cpp \
+4 -2
@@ -22,7 +22,7 @@
# \___/_/\_\ .__/ \__,_|\__|
# |_| XML parser
#
# Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2017-2026 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2018 KangLin <kl222@126.com>
# Copyright (c) 2022 Johnny Jazeix <jazeix@gmail.com>
# Copyright (c) 2023 Sony Corporation / Snild Dolkow <snild@sony.com>
@@ -395,6 +395,9 @@ SO_MINOR = @SO_MINOR@
SO_PATCH = @SO_PATCH@
STRIP = @STRIP@
VERSION = @VERSION@
VSCRIPT_LDFLAGS = @VSCRIPT_LDFLAGS@
_EXPAT_COMMENT_ATTR_INFO = @_EXPAT_COMMENT_ATTR_INFO@
_EXPAT_COMMENT_DTD_OR_GE = @_EXPAT_COMMENT_DTD_OR_GE@
abs_builddir = @abs_builddir@
abs_srcdir = @abs_srcdir@
abs_top_builddir = @abs_top_builddir@
@@ -497,7 +500,6 @@ EXTRA_DIST = \
$(_EXTRA_DIST_CMAKE) \
$(_EXTRA_DIST_WINDOWS) \
\
conftools/expat.m4 \
conftools/get-version.sh \
\
fuzz/xml_lpm_fuzzer.cpp \
+4 -9
@@ -11,7 +11,7 @@
> at the top of the `Changes` file.
# Expat, Release 2.7.3
# Expat, Release 2.7.5
This is Expat, a C99 library for parsing
[XML 1.0 Fourth Edition](https://www.w3.org/TR/2006/REC-xml-20060816/), started by
@@ -234,11 +234,6 @@ overrides the in-makefile set `DESTDIR`, because variable-setting priority is
Note: This only applies to the Expat library itself, building UTF-16 versions
of xmlwf and the tests is currently not supported.
When using Expat with a project using autoconf for configuration, you
can use the probing macro in `conftools/expat.m4` to determine how to
include Expat. See the comments at the top of that file for more
information.
A reference manual is available in the file `doc/reference.html` in this
distribution.
@@ -297,15 +292,15 @@ EXPAT_OSSFUZZ_BUILD:BOOL=OFF
// Build a shared expat library
EXPAT_SHARED_LIBS:BOOL=ON
// Define to provide symbol versioning for dependency generation
EXPAT_SYMBOL_VERSIONING:BOOL=OFF
// Treat all compiler warnings as errors
EXPAT_WARNINGS_AS_ERRORS:BOOL=OFF
// Make use of getrandom function (ON|OFF|AUTO) [default=AUTO]
EXPAT_WITH_GETRANDOM:STRING=AUTO
// Utilize libbsd (for arc4random_buf)
EXPAT_WITH_LIBBSD:BOOL=OFF
// Make use of syscall SYS_getrandom (ON|OFF|AUTO) [default=AUTO]
EXPAT_WITH_SYS_GETRANDOM:STRING=AUTO
```
+36 -30
@@ -11,7 +11,7 @@ dnl Copyright (c) 2000 Clark Cooper <coopercc@users.sourceforge.net>
dnl Copyright (c) 2000-2005 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
dnl Copyright (c) 2001-2003 Greg Stein <gstein@users.sourceforge.net>
dnl Copyright (c) 2006-2012 Karl Waclawek <karl@waclawek.net>
dnl Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
dnl Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
dnl Copyright (c) 2017 S. P. Zeidler <spz@netbsd.org>
dnl Copyright (c) 2017 Stephen Groat <stephen@groat.us>
dnl Copyright (c) 2017-2020 Joe Orton <jorton@redhat.com>
@@ -25,6 +25,10 @@ dnl Copyright (c) 2020 Jeffrey Walton <noloader@gmail.com>
dnl Copyright (c) 2024 Ferenc Géczi <ferenc.gm@gmail.com>
dnl Copyright (c) 2024 Dag-Erling Smørgrav <des@des.dev>
dnl Copyright (c) 2025 Matthew Fernandez <matthew.fernandez@gmail.com>
dnl Copyright (c) 2025 Alfonso Gregory <gfunni234@gmail.com>
dnl Copyright (c) 2026 Rosen Penev <rosenp@gmail.com>
dnl Copyright (c) 2026 Gordon Messmer <gordon.messmer@gmail.com>
dnl Copyright (c) 2026 Fabio Scaccabarozzi <fsvm88@gmail.com>
dnl Licensed under the MIT license:
dnl
dnl Permission is hereby granted, free of charge, to any person obtaining
@@ -86,7 +90,7 @@ dnl If the API changes incompatibly set LIBAGE back to 0
dnl
LIBCURRENT=12 # sync
LIBREVISION=1 # with
LIBREVISION=3 # with
LIBAGE=11 # CMakeLists.txt!
AC_CONFIG_HEADERS([expat_config.h])
@@ -117,10 +121,12 @@ AS_IF([test "$GCC" = yes],
dnl GCC don't support it and it causes extra warnings that are only
dnl distracting; avoid.
AX_APPEND_COMPILE_FLAGS([-fexceptions], [AM_CFLAGS])
AX_APPEND_COMPILE_FLAGS([-fno-strict-aliasing -Wmissing-prototypes -Wstrict-prototypes], [AM_CFLAGS])
AX_APPEND_COMPILE_FLAGS([-Wstrict-aliasing=3 -Wmissing-prototypes -Wstrict-prototypes], [AM_CFLAGS])
AX_APPEND_COMPILE_FLAGS([-pedantic -Wduplicated-cond -Wduplicated-branches -Wlogical-op], [AM_CFLAGS])
AX_APPEND_COMPILE_FLAGS([-Wrestrict -Wnull-dereference -Wjump-misses-init -Wdouble-promotion], [AM_CFLAGS])
AX_APPEND_COMPILE_FLAGS([-Wshadow -Wformat=2 -Wno-pedantic-ms-format -Wmisleading-indentation], [AM_CFLAGS])])
AX_APPEND_COMPILE_FLAGS([-Wshadow -Wformat=2 -Wmisleading-indentation], [AM_CFLAGS])
AS_CASE(["${host_os}"], [mingw*], [AX_APPEND_COMPILE_FLAGS([-Wno-pedantic-ms-format], [AM_CFLAGS])])
])
AC_LANG_PUSH([C++])
AC_PROG_CXX
@@ -131,11 +137,23 @@ AS_IF([test "$GCC" = yes],
dnl GCC don't support it and it causes extra warnings that are only
dnl distracting; avoid.
AX_APPEND_COMPILE_FLAGS([-fexceptions], [AM_CXXFLAGS])
AX_APPEND_COMPILE_FLAGS([-fno-strict-aliasing], [AM_CXXFLAGS])])
AX_APPEND_COMPILE_FLAGS([-Wstrict-aliasing=3], [AM_CXXFLAGS])])
AC_LANG_POP([C++])
AS_IF([test "$GCC" = yes],
[AX_APPEND_LINK_FLAGS([-fno-strict-aliasing],[AM_LDFLAGS])])
[AX_APPEND_LINK_FLAGS([-Wstrict-aliasing=3],[AM_LDFLAGS])])
AC_ARG_ENABLE([symbol-versioning],
[AS_HELP_STRING([--enable-symbol-versioning],
[provide symbol versioning for dependency generation @<:@default=no@:>@])],
[enable_symbol_versioning=$enableval],
[enable_symbol_versioning=no])
AS_IF([test "x$enable_symbol_versioning" != xno],
[VSCRIPT_LDFLAGS="-Wl,--version-script"
AC_SUBST([VSCRIPT_LDFLAGS])
])
AM_CONDITIONAL([HAVE_VSCRIPT],
[test "x$enable_symbol_versioning" != xno])
dnl patching ${archive_cmds} to affect generation of file "libtool" to fix linking with clang (issue #312)
AS_CASE(["$LD"],[*clang*],
@@ -199,23 +217,9 @@ AM_CONDITIONAL([_INTERNAL_LARGE_SIZE], [echo -- "${CPPFLAGS}${CFLAGS}" | ${FGREP
LT_LIB_M
AC_ARG_WITH([libbsd],
[AS_HELP_STRING([--with-libbsd], [utilize libbsd (for arc4random_buf)])],
[],
[with_libbsd=no])
AS_IF([test "x${with_libbsd}" != xno],
[AC_CHECK_LIB([bsd],
[arc4random_buf],
[],
[AS_IF([test "x${with_libbsd}" = xyes],
[AC_MSG_ERROR([Enforced use of libbsd cannot be satisfied.])])])])
AC_MSG_CHECKING([for arc4random_buf (BSD, libbsd or glibc 2.36+)])
AC_MSG_CHECKING([for arc4random_buf (BSD or glibc 2.36+)])
AC_LINK_IFELSE([AC_LANG_SOURCE([
#if defined(HAVE_LIBBSD)
# include <bsd/stdlib.h>
#else
# include <stdlib.h> /* for arc4random_buf on BSD */
#endif
#include <stdlib.h>
int main(void) {
char dummy[[123]]; // double brackets for m4
arc4random_buf(dummy, 0U);
@@ -226,13 +230,9 @@ AC_LINK_IFELSE([AC_LANG_SOURCE([
AC_MSG_RESULT([yes])],
[AC_MSG_RESULT([no])
AC_MSG_CHECKING([for arc4random (BSD, macOS, libbsd or glibc 2.36+)])
AC_MSG_CHECKING([for arc4random (BSD, macOS, or glibc 2.36+)])
AC_LINK_IFELSE([AC_LANG_SOURCE([
#if defined(HAVE_LIBBSD)
# include <bsd/stdlib.h>
#else
# include <stdlib.h>
#endif
#include <stdlib.h>
int main(void) {
arc4random();
return 0;
@@ -381,9 +381,14 @@ dnl NOTE: The *_TRUE variables read here are Automake conditionals
dnl that are either set to "" when enabled or to "#" when disabled
dnl (because they are used to dynamically comment out certain things)
AS_IF([test "x${enable_xml_attr_info}" = xyes],
[EXPAT_ATTR_INFO=ON],
[EXPAT_ATTR_INFO=OFF])
[EXPAT_ATTR_INFO=ON
_EXPAT_COMMENT_ATTR_INFO=" "],
[EXPAT_ATTR_INFO=OFF
_EXPAT_COMMENT_ATTR_INFO="#"])
AC_SUBST([_EXPAT_COMMENT_ATTR_INFO])
EXPAT_DTD=ON
_EXPAT_COMMENT_DTD_OR_GE=" "
AC_SUBST([_EXPAT_COMMENT_DTD_OR_GE])
AS_IF([test "x${_INTERNAL_LARGE_SIZE_TRUE}" = x],
[EXPAT_LARGE_SIZE=ON],
[EXPAT_LARGE_SIZE=OFF])
@@ -461,6 +466,7 @@ AC_CONFIG_FILES([Makefile]
[doc/Makefile]
[examples/Makefile]
[lib/Makefile]
[lib/libexpat.map]
[tests/Makefile]
[tests/benchmark/Makefile]
[xmlwf/Makefile])
+3
@@ -293,6 +293,9 @@ SO_MINOR = @SO_MINOR@
SO_PATCH = @SO_PATCH@
STRIP = @STRIP@
VERSION = @VERSION@
VSCRIPT_LDFLAGS = @VSCRIPT_LDFLAGS@
_EXPAT_COMMENT_ATTR_INFO = @_EXPAT_COMMENT_ATTR_INFO@
_EXPAT_COMMENT_DTD_OR_GE = @_EXPAT_COMMENT_DTD_OR_GE@
abs_builddir = @abs_builddir@
abs_srcdir = @abs_srcdir@
abs_top_builddir = @abs_top_builddir@
+2814 -1863
File diff suppressed because it is too large
+10 -5
@@ -5,7 +5,7 @@
\\$2 \(la\\$1\(ra\\$3
..
.if \n(.g .mso www.tmac
.TH XMLWF 1 "September 24, 2025" "" ""
.TH XMLWF 1 "March 17, 2026" "" ""
.SH NAME
xmlwf \- Determines if an XML document is well-formed
.SH SYNOPSIS
@@ -97,7 +97,7 @@ The amplification factor is calculated as ..
.nf
amplification := (direct + indirect) / direct
amplification := (direct + indirect) / direct
.fi
@@ -105,7 +105,7 @@ The amplification factor is calculated as ..
.nf
amplification := allocated / direct
amplification := allocated / direct
.fi
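The two amplification formulas in the man page text above can be illustrated with a small worked example. This is only a sketch of the arithmetic, not xmlwf's or Expat's implementation. The function names are hypothetical, and it is assumed here that "direct" means bytes read from the original input and "indirect" means bytes produced through entity expansion.

```python
def entity_amplification(direct: int, indirect: int) -> float:
    """amplification := (direct + indirect) / direct"""
    if direct <= 0:
        raise ValueError("direct must be positive")
    return (direct + indirect) / direct


def allocation_amplification(allocated: int, direct: int) -> float:
    """amplification := allocated / direct"""
    if direct <= 0:
        raise ValueError("direct must be positive")
    return allocated / direct


if __name__ == "__main__":
    # 100 input bytes whose entities expand to a further 9900 bytes.
    print(entity_amplification(direct=100, indirect=9900))
    # 1024 input bytes that lead to 4096 bytes of dynamic memory.
    print(allocation_amplification(allocated=4096, direct=1024))
```

A document without any entity expansion has an entity amplification of exactly 1, so the factor only grows with indirect bytes.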
@@ -235,7 +235,7 @@ the operating system reporting memory in a strange way; there is
not a leak in \fBxmlwf\fR.
.TP
\*(T<\fB\-s\fR\*(T>
Prints an error if the document is not standalone.
Prints an error if the document is not standalone.
A document is standalone if it has no external subset and no
references to parameter entities.
.TP
@@ -261,6 +261,7 @@ page. See also \*(T<\fB\-e\fR\*(T>.
.TP
\*(T<\fB\-x\fR\*(T>
Turns on parsing external entities.
(CAREFUL! This makes xmlwf vulnerable to external entity attacks (XXE).)
Non-validating parsers are not required to resolve external
entities, or even expand entities at all.
@@ -275,6 +276,7 @@ This is an example of an internal entity:
.nf
<!ENTITY vers '1.0.2'>
.fi
And here are some examples of external entities:
@@ -283,6 +285,7 @@ And here are some examples of external entities:
<!ENTITY header SYSTEM "header\-&vers;.xml"> (parsed)
<!ENTITY logo SYSTEM "logo.png" PNG> (unparsed)
.fi
.TP
\*(T<\fB\-\-\fR\*(T>
@@ -293,6 +296,7 @@ starts with a hyphen. For example:
.nf
xmlwf \-\- \-myfile.xml
.fi
will run \fBxmlwf\fR on the file
@@ -307,7 +311,7 @@ input file cannot be opened, \fBxmlwf\fR prints a single
line describing the problem to standard output.
.PP
If the \*(T<\fB\-k\fR\*(T> option is not provided, \fBxmlwf\fR
halts upon encountering a well-formedness or output-file error.
halts upon encountering a well-formedness or output-file error.
If \*(T<\fB\-k\fR\*(T> is provided, \fBxmlwf\fR continues
processing the remaining input files, describing problems found with any of them.
.SH "EXIT STATUS"
@@ -344,6 +348,7 @@ me, I'd like to add this information to this manpage.
The Expat home page: https://libexpat.github.io/
The W3 XML 1.0 specification (fourth edition): https://www.w3.org/TR/2006/REC\-xml\-20060816/
Billion laughs attack: https://en.wikipedia.org/wiki/Billion_laughs_attack
.fi
.SH AUTHOR
This manual page was originally written by Scott Bronson <\*(T<bronson@rinspin.com\*(T>>
+244 -244
@@ -9,7 +9,7 @@
Copyright (c) 2001 Scott Bronson <bronson@rinspin.com>
Copyright (c) 2002-2003 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2009 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016 Ardo van Rangelrooij <ardo@debian.org>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2020 Joe Orton <jorton@redhat.com>
@@ -21,7 +21,7 @@
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
<!ENTITY dhfirstname "<firstname>Scott</firstname>">
<!ENTITY dhsurname "<surname>Bronson</surname>">
<!ENTITY dhdate "<date>September 24, 2025</date>">
<!ENTITY dhdate "<date>March 17, 2026</date>">
<!-- Please adjust this^^ date whenever cutting a new release. -->
<!ENTITY dhsection "<manvolnum>1</manvolnum>">
<!ENTITY dhemail "<email>bronson@rinspin.com</email>">
@@ -29,8 +29,8 @@
<!ENTITY dhucpackage "<refentrytitle>XMLWF</refentrytitle>">
<!ENTITY dhpackage "xmlwf">
<!ENTITY debian "<productname>Debian GNU/Linux</productname>">
<!ENTITY gnu "<acronym>GNU</acronym>">
<!ENTITY debian "<productname>Debian &gnu;/Linux</productname>">
]>
<refentry>
@@ -84,73 +84,77 @@
<title>DESCRIPTION</title>
<para>
<command>&dhpackage;</command> uses the Expat library to
determine if an XML document is well-formed. It is
non-validating.
</para>
<para>
If you do not specify any files on the command-line, and you
have a recent version of <command>&dhpackage;</command>, the
input file will be read from standard input.
</para>
<command>&dhpackage;</command> uses the Expat library to
determine if an XML document is well-formed. It is
non-validating.
</para>
<para>
If you do not specify any files on the command-line, and you
have a recent version of <command>&dhpackage;</command>, the
input file will be read from standard input.
</para>
</refsect1>
<refsect1>
<title>WELL-FORMED DOCUMENTS</title>
<para>
A well-formed document must adhere to the
following rules:
</para>
<itemizedlist>
<listitem><para>
The file begins with an XML declaration. For instance,
<literal>&lt;?xml version="1.0" standalone="yes"?&gt;</literal>.
<emphasis>NOTE</emphasis>:
<command>&dhpackage;</command> does not currently
check for a valid XML declaration.
</para></listitem>
<listitem><para>
Every start tag is either empty (&lt;tag/&gt;)
or has a corresponding end tag.
</para></listitem>
<listitem><para>
There is exactly one root element. This element must contain
all other elements in the document. Only comments, white
space, and processing instructions may come after the close
of the root element.
</para></listitem>
<listitem><para>
All elements nest properly.
</para></listitem>
<listitem><para>
All attribute values are enclosed in quotes (either single
or double).
</para></listitem>
<para>
A well-formed document must adhere to the
following rules:
</para>
<itemizedlist>
<listitem>
<para>
The file begins with an XML declaration. For instance,
<literal>&lt;?xml version="1.0" standalone="yes"?&gt;</literal>.
<emphasis>NOTE</emphasis>:
<command>&dhpackage;</command> does not currently
check for a valid XML declaration.
</para>
</listitem>
<listitem>
<para>
Every start tag is either empty (&lt;tag/&gt;)
or has a corresponding end tag.
</para>
</listitem>
<listitem>
<para>
There is exactly one root element. This element must contain
all other elements in the document. Only comments, white
space, and processing instructions may come after the close
of the root element.
</para>
</listitem>
<listitem>
<para>
All elements nest properly.
</para>
</listitem>
<listitem>
<para>
All attribute values are enclosed in quotes (either single
or double).
</para>
</listitem>
</itemizedlist>
<para>
If the document has a DTD, and it strictly complies with that
DTD, then the document is also considered <emphasis>valid</emphasis>.
<command>&dhpackage;</command> is a non-validating parser --
it does not check the DTD. However, it does support
external entities (see the <option>-x</option> option).
</para>
<para>
If the document has a DTD, and it strictly complies with that
DTD, then the document is also considered <emphasis>valid</emphasis>.
<command>&dhpackage;</command> is a non-validating parser --
it does not check the DTD. However, it does support
external entities (see the <option>-x</option> option).
</para>
</refsect1>
<refsect1>
<title>OPTIONS</title>
<para>
When an option includes an argument, you may specify the argument either
separately ("<option>-d</option> <replaceable>output</replaceable>") or concatenated with the
option ("<option>-d</option><replaceable>output</replaceable>"). <command>&dhpackage;</command>
supports both.
</para>
<para>
When an option includes an argument, you may specify the argument either
separately ("<option>-d</option> <replaceable>output</replaceable>") or concatenated with the
option ("<option>-d</option><replaceable>output</replaceable>"). <command>&dhpackage;</command>
supports both.
</para>
<variablelist>
<varlistentry>
@@ -166,13 +170,13 @@ supports both.
The amplification factor is calculated as ..
</para>
<literallayout>
amplification := (direct + indirect) / direct
amplification := (direct + indirect) / direct
</literallayout>
<para>
.. with regard to use of entities and ..
</para>
<literallayout>
amplification := allocated / direct
amplification := allocated / direct
</literallayout>
<para>
.. with regard to dynamic memory while parsing.
@@ -214,60 +218,60 @@ supports both.
<varlistentry>
<term><option>-c</option></term>
<listitem>
<para>
If the input file is well-formed and <command>&dhpackage;</command>
doesn't encounter any errors, the input file is simply copied to
the output directory unchanged.
This implies no namespaces (turns off <option>-n</option>) and
requires <option>-d</option> to specify an output directory.
</para>
<para>
If the input file is well-formed and <command>&dhpackage;</command>
doesn't encounter any errors, the input file is simply copied to
the output directory unchanged.
This implies no namespaces (turns off <option>-n</option>) and
requires <option>-d</option> to specify an output directory.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-d</option> <replaceable>output-dir</replaceable></term>
<listitem>
<para>
Specifies a directory to contain transformed
representations of the input files.
By default, <option>-d</option> outputs a canonical representation
(described below).
You can select different output formats using <option>-c</option>,
<option>-m</option> and <option>-N</option>.
</para>
<para>
The output filenames will
be exactly the same as the input filenames or "STDIN" if the input is
coming from standard input. Therefore, you must be careful that the
output file does not go into the same directory as the input
file. Otherwise, <command>&dhpackage;</command> will delete the
input file before it generates the output file (just like running
<literal>cat &lt; file &gt; file</literal> in most shells).
</para>
<para>
Two structurally equivalent XML documents have a byte-for-byte
identical canonical XML representation.
Note that ignorable white space is considered significant and
is treated equivalently to data.
More on canonical XML can be found at
http://www.jclark.com/xml/canonxml.html .
</para>
<para>
Specifies a directory to contain transformed
representations of the input files.
By default, <option>-d</option> outputs a canonical representation
(described below).
You can select different output formats using <option>-c</option>,
<option>-m</option> and <option>-N</option>.
</para>
<para>
The output filenames will
be exactly the same as the input filenames or "STDIN" if the input is
coming from standard input. Therefore, you must be careful that the
output file does not go into the same directory as the input
file. Otherwise, <command>&dhpackage;</command> will delete the
input file before it generates the output file (just like running
<literal>cat &lt; file &gt; file</literal> in most shells).
</para>
<para>
Two structurally equivalent XML documents have a byte-for-byte
identical canonical XML representation.
Note that ignorable white space is considered significant and
is treated equivalently to data.
More on canonical XML can be found at
http://www.jclark.com/xml/canonxml.html .
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-e</option> <replaceable>encoding</replaceable></term>
<listitem>
<para>
Specifies the character encoding for the document, overriding
any document encoding declaration. <command>&dhpackage;</command>
supports four built-in encodings:
<literal>US-ASCII</literal>,
<literal>UTF-8</literal>,
<literal>UTF-16</literal>, and
<literal>ISO-8859-1</literal>.
Also see the <option>-w</option> option.
</para>
<para>
Specifies the character encoding for the document, overriding
any document encoding declaration. <command>&dhpackage;</command>
supports four built-in encodings:
<literal>US-ASCII</literal>,
<literal>UTF-8</literal>,
<literal>UTF-16</literal>, and
<literal>ISO-8859-1</literal>.
Also see the <option>-w</option> option.
</para>
</listitem>
</varlistentry>
@@ -312,21 +316,21 @@ supports both.
<varlistentry>
<term><option>-m</option></term>
<listitem>
<para>
Outputs some strange sort of XML file that completely
describes the input file, including character positions.
Requires <option>-d</option> to specify an output file.
</para>
<para>
Outputs some strange sort of XML file that completely
describes the input file, including character positions.
Requires <option>-d</option> to specify an output file.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-n</option></term>
<listitem>
<para>
Turns on namespace processing. (describe namespaces)
<option>-c</option> disables namespaces.
</para>
<para>
Turns on namespace processing. (describe namespaces)
<option>-c</option> disables namespaces.
</para>
</listitem>
</varlistentry>
@@ -334,9 +338,9 @@ supports both.
<term><option>-N</option></term>
<listitem>
<para>
Adds a doctype and notation declarations to canonical XML output.
This matches the example output used by the formal XML test cases.
Requires <option>-d</option> to specify an output file.
Adds a doctype and notation declarations to canonical XML output.
This matches the example output used by the formal XML test cases.
Requires <option>-d</option> to specify an output file.
</para>
</listitem>
</varlistentry>
@@ -344,15 +348,15 @@ supports both.
<varlistentry>
<term><option>-p</option></term>
<listitem>
<para>
Tells <command>&dhpackage;</command> to process external DTDs and parameter
entities.
</para>
<para>
Normally <command>&dhpackage;</command> never parses parameter
entities. <option>-p</option> tells it to always parse them.
<option>-p</option> implies <option>-x</option>.
</para>
<para>
Tells <command>&dhpackage;</command> to process external DTDs and parameter
entities.
</para>
<para>
Normally <command>&dhpackage;</command> never parses parameter
entities. <option>-p</option> tells it to always parse them.
<option>-p</option> implies <option>-x</option>.
</para>
</listitem>
</varlistentry>
@@ -369,47 +373,47 @@ supports both.
<varlistentry>
<term><option>-r</option></term>
<listitem>
<para>
Normally <command>&dhpackage;</command> memory-maps the XML file
before parsing; this can result in faster parsing on many
platforms.
<option>-r</option> turns off memory-mapping and uses normal file
IO calls instead.
Of course, memory-mapping is automatically turned off
when reading from standard input.
</para>
<para>
Use of memory-mapping can cause some platforms to report
substantially higher memory usage for
<command>&dhpackage;</command>, but this appears to be a matter of
the operating system reporting memory in a strange way; there is
not a leak in <command>&dhpackage;</command>.
</para>
<para>
Normally <command>&dhpackage;</command> memory-maps the XML file
before parsing; this can result in faster parsing on many
platforms.
<option>-r</option> turns off memory-mapping and uses normal file
IO calls instead.
Of course, memory-mapping is automatically turned off
when reading from standard input.
</para>
<para>
Use of memory-mapping can cause some platforms to report
substantially higher memory usage for
<command>&dhpackage;</command>, but this appears to be a matter of
the operating system reporting memory in a strange way; there is
not a leak in <command>&dhpackage;</command>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-s</option></term>
<listitem>
<para>
Prints an error if the document is not standalone.
A document is standalone if it has no external subset and no
references to parameter entities.
</para>
<para>
Prints an error if the document is not standalone.
A document is standalone if it has no external subset and no
references to parameter entities.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-t</option></term>
<listitem>
<para>
Turns on timings. This tells Expat to parse the entire file,
but not perform any processing.
This gives a fairly accurate idea of the raw speed of Expat itself
without client overhead.
<option>-t</option> turns off most of the output options
(<option>-d</option>, <option>-m</option>, <option>-c</option>, ...).
</para>
<para>
Turns on timings. This tells Expat to parse the entire file,
but not perform any processing.
This gives a fairly accurate idea of the raw speed of Expat itself
without client overhead.
<option>-t</option> turns off most of the output options
(<option>-d</option>, <option>-m</option>, <option>-c</option>, ...).
</para>
</listitem>
</varlistentry>
@@ -417,104 +421,102 @@ supports both.
<term><option>-v</option></term>
<term><option>--version</option></term>
<listitem>
<para>
Prints the version of the Expat library being used, including some
information on the compile-time configuration of the library, and
then exits.
</para>
<para>
Prints the version of the Expat library being used, including some
information on the compile-time configuration of the library, and
then exits.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-w</option></term>
<listitem>
<para>
Enables support for Windows code pages.
Normally, <command>&dhpackage;</command> will throw an error if it
runs across an encoding that it is not equipped to handle itself. With
<option>-w</option>, <command>&dhpackage;</command> will try to use a Windows code
page. See also <option>-e</option>.
</para>
<para>
Enables support for Windows code pages.
Normally, <command>&dhpackage;</command> will throw an error if it
runs across an encoding that it is not equipped to handle itself. With
<option>-w</option>, <command>&dhpackage;</command> will try to use a Windows code
page. See also <option>-e</option>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-x</option></term>
<listitem>
<para>
Turns on parsing external entities.
</para>
<para>
Non-validating parsers are not required to resolve external
entities, or even expand entities at all.
Expat always expands internal entities (?),
but external entity parsing must be enabled explicitly.
</para>
<para>
External entities are simply entities that obtain their
data from outside the XML file currently being parsed.
</para>
<para>
This is an example of an internal entity:
<literallayout>
<para>
Turns on parsing external entities.
(CAREFUL! This makes xmlwf vulnerable to external entity attacks (XXE).)
</para>
<para>
Non-validating parsers are not required to resolve external
entities, or even expand entities at all.
Expat always expands internal entities (?),
but external entity parsing must be enabled explicitly.
</para>
<para>
External entities are simply entities that obtain their
data from outside the XML file currently being parsed.
</para>
<para>
This is an example of an internal entity:
<literallayout>
&lt;!ENTITY vers '1.0.2'&gt;
</literallayout>
</para>
<para>
And here are some examples of external entities:
</literallayout>
</para>
<para>
And here are some examples of external entities:
<literallayout>
<literallayout>
&lt;!ENTITY header SYSTEM "header-&amp;vers;.xml"&gt; (parsed)
&lt;!ENTITY logo SYSTEM "logo.png" PNG&gt; (unparsed)
</literallayout>
</para>
</literallayout>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--</option></term>
<listitem>
<para>
(Two hyphens.)
Terminates the list of options. This is only needed if a filename
starts with a hyphen. For example:
</para>
<literallayout>
<para>
(Two hyphens.)
Terminates the list of options. This is only needed if a filename
starts with a hyphen. For example:
</para>
<literallayout>
&dhpackage; -- -myfile.xml
</literallayout>
<para>
will run <command>&dhpackage;</command> on the file
<filename>-myfile.xml</filename>.
</para>
</literallayout>
<para>
will run <command>&dhpackage;</command> on the file
<filename>-myfile.xml</filename>.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Older versions of <command>&dhpackage;</command> do not support
reading from standard input.
</para>
</refsect1>
<refsect1>
<title>OUTPUT</title>
<para>
<command>&dhpackage;</command> outputs nothing for files which are problem-free.
If any input file is not well-formed, or if the output for any
input file cannot be opened, <command>&dhpackage;</command> prints a single
line describing the problem to standard output.
</para>
<para>
If the <option>-k</option> option is not provided, <command>&dhpackage;</command>
halts upon encountering a well-formedness or output-file error.
If <option>-k</option> is provided, <command>&dhpackage;</command> continues
processing the remaining input files, describing problems found with any of them.
Older versions of <command>&dhpackage;</command> do not support
reading from standard input.
</para>
</refsect1>
<refsect1>
<title>EXIT STATUS</title>
<title>OUTPUT</title>
<para><command>&dhpackage;</command> outputs nothing for files which are problem-free.
If any input file is not well-formed, or if the output for any
input file cannot be opened, <command>&dhpackage;</command> prints a single
line describing the problem to standard output.
</para>
<para>
If the <option>-k</option> option is not provided, <command>&dhpackage;</command>
halts upon encountering a well-formedness or output-file error.
If <option>-k</option> is provided, <command>&dhpackage;</command> continues
processing the remaining input files, describing problems found with any of them.
</para>
</refsect1>
<refsect1>
<title>EXIT STATUS</title>
<para>For options <option>-v</option>|<option>--version</option> or <option>-h</option>|<option>--help</option>, <command>&dhpackage;</command> always exits with status code 0. For other cases, the following exit status codes are returned:
<variablelist>
<varlistentry>
@@ -543,39 +545,37 @@ supports both.
</listitem>
</varlistentry>
</variablelist>
</para>
</para>
</refsect1>
<refsect1>
<title>BUGS</title>
<para>
The errors should go to standard error, not standard output.
</para>
<para>
There should be a way to get <option>-d</option> to send its
output to standard output rather than forcing the user to send
it to a file.
</para>
<para>
I have no idea why anyone would want to use the
<option>-d</option>, <option>-c</option>, and
<option>-m</option> options. If someone could explain it to
me, I'd like to add this information to this manpage.
</para>
<para>
The errors should go to standard error, not standard output.
</para>
<para>
There should be a way to get <option>-d</option> to send its
output to standard output rather than forcing the user to send
it to a file.
</para>
<para>
I have no idea why anyone would want to use the
<option>-d</option>, <option>-c</option>, and
<option>-m</option> options. If someone could explain it to
me, I'd like to add this information to this manpage.
</para>
</refsect1>
<refsect1>
<title>SEE ALSO</title>
<para>
<literallayout>
<para>
<literallayout>
The Expat home page: https://libexpat.github.io/
The W3 XML 1.0 specification (fourth edition): https://www.w3.org/TR/2006/REC-xml-20060816/
Billion laughs attack: https://en.wikipedia.org/wiki/Billion_laughs_attack
</literallayout>
</para>
</literallayout>
</para>
</refsect1>
<refsect1>
@@ -585,8 +585,8 @@ Billion laughs attack: https://en.wikipedia.org/wiki/Bi
in December 2001 for
the &debian; system (but may be used by others). Permission is
granted to copy, distribute and/or modify this document under
the terms of the <acronym>GNU</acronym> Free Documentation
the terms of the &gnu; Free Documentation
License, Version 1.1.
</para>
</para>
</refsect1>
</refentry>
+3
@@ -321,6 +321,9 @@ SO_MINOR = @SO_MINOR@
SO_PATCH = @SO_PATCH@
STRIP = @STRIP@
VERSION = @VERSION@
VSCRIPT_LDFLAGS = @VSCRIPT_LDFLAGS@
_EXPAT_COMMENT_ATTR_INFO = @_EXPAT_COMMENT_ATTR_INFO@
_EXPAT_COMMENT_DTD_OR_GE = @_EXPAT_COMMENT_DTD_OR_GE@
abs_builddir = @abs_builddir@
abs_srcdir = @abs_srcdir@
abs_top_builddir = @abs_top_builddir@
-3
@@ -33,9 +33,6 @@
/* Define to 1 if you have the <inttypes.h> header file. */
#undef HAVE_INTTYPES_H
/* Define to 1 if you have the 'bsd' library (-lbsd). */
#undef HAVE_LIBBSD
/* Define to 1 if you have a working 'mmap' system call. */
#undef HAVE_MMAP
+3 -2
@@ -6,7 +6,7 @@
# \___/_/\_\ .__/ \__,_|\__|
# |_| XML parser
#
# Copyright (c) 2019-2022 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2019-2026 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2024 Dag-Erling Smørgrav <des@des.dev>
# Licensed under the MIT license:
#
@@ -31,9 +31,10 @@
set -e
sed="$(type -P gsed sed false | head -n 1)" # e.g. for Solaris
filename="${1:-tests/xmltest.log}"
sed -i.bak \
exec "${sed}" -i.bak \
-e '# convert DOS line endings to Unix without resorting to dos2unix' \
-e $'s/\r//' \
\
+5 -1
@@ -6,9 +6,10 @@
# \___/_/\_\ .__/ \__,_|\__|
# |_| XML parser
#
# Copyright (c) 2017-2024 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2017-2026 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2017 Tomasz Kłoczko <kloczek@fedoraproject.org>
# Copyright (c) 2019 David Loffredo <loffredo@steptools.com>
# Copyright (c) 2026 Gordon Messmer <gordon.messmer@gmail.com>
# Licensed under the MIT license:
#
# Permission is hereby granted, free of charge, to any person obtaining
@@ -45,6 +46,9 @@ libexpat_la_LDFLAGS = \
@LIBM@ \
-no-undefined \
-version-info @LIBCURRENT@:@LIBREVISION@:@LIBAGE@
if HAVE_VSCRIPT
libexpat_la_LDFLAGS += $(VSCRIPT_LDFLAGS),@builddir@/libexpat.map
endif
libexpat_la_SOURCES = \
xmlparse.c \
+12 -9
@@ -22,9 +22,10 @@
# \___/_/\_\ .__/ \__,_|\__|
# |_| XML parser
#
# Copyright (c) 2017-2024 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2017-2026 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2017 Tomasz Kłoczko <kloczek@fedoraproject.org>
# Copyright (c) 2019 David Loffredo <loffredo@steptools.com>
# Copyright (c) 2026 Gordon Messmer <gordon.messmer@gmail.com>
# Licensed under the MIT license:
#
# Permission is hereby granted, free of charge, to any person obtaining
@@ -124,6 +125,7 @@ PRE_UNINSTALL = :
POST_UNINSTALL = :
build_triplet = @build@
host_triplet = @host@
@HAVE_VSCRIPT_TRUE@am__append_1 = $(VSCRIPT_LDFLAGS),@builddir@/libexpat.map
subdir = lib
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/m4/libtool.m4 \
@@ -146,7 +148,7 @@ DIST_COMMON = $(srcdir)/Makefile.am $(include_HEADERS) \
$(am__DIST_COMMON)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/expat_config.h
CONFIG_CLEAN_FILES =
CONFIG_CLEAN_FILES = libexpat.map
CONFIG_CLEAN_VPATH_FILES =
am__vpath_adj_setup = srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`;
am__vpath_adj = case $$p in \
@@ -259,7 +261,7 @@ am__define_uniq_tagged_files = \
unique=`for i in $$list; do \
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
done | $(am__uniquify_input)`
am__DIST_COMMON = $(srcdir)/Makefile.in \
am__DIST_COMMON = $(srcdir)/Makefile.in $(srcdir)/libexpat.map.in \
$(top_srcdir)/conftools/depcomp
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = @ACLOCAL@
@@ -358,6 +360,9 @@ SO_MINOR = @SO_MINOR@
SO_PATCH = @SO_PATCH@
STRIP = @STRIP@
VERSION = @VERSION@
VSCRIPT_LDFLAGS = @VSCRIPT_LDFLAGS@
_EXPAT_COMMENT_ATTR_INFO = @_EXPAT_COMMENT_ATTR_INFO@
_EXPAT_COMMENT_DTD_OR_GE = @_EXPAT_COMMENT_DTD_OR_GE@
abs_builddir = @abs_builddir@
abs_srcdir = @abs_srcdir@
abs_top_builddir = @abs_top_builddir@
@@ -421,12 +426,8 @@ include_HEADERS = \
lib_LTLIBRARIES = libexpat.la
@WITH_TESTS_TRUE@noinst_LTLIBRARIES = libtestpat.la
libexpat_la_LDFLAGS = \
@AM_LDFLAGS@ \
@LIBM@ \
-no-undefined \
-version-info @LIBCURRENT@:@LIBREVISION@:@LIBAGE@
libexpat_la_LDFLAGS = @AM_LDFLAGS@ @LIBM@ -no-undefined -version-info \
@LIBCURRENT@:@LIBREVISION@:@LIBAGE@ $(am__append_1)
libexpat_la_SOURCES = \
xmlparse.c \
xmltok.c \
@@ -490,6 +491,8 @@ $(top_srcdir)/configure: @MAINTAINER_MODE_TRUE@ $(am__configure_deps)
$(ACLOCAL_M4): @MAINTAINER_MODE_TRUE@ $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(am__aclocal_m4_deps):
libexpat.map: $(top_builddir)/config.status $(srcdir)/libexpat.map.in
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@
install-libLTLIBRARIES: $(lib_LTLIBRARIES)
@$(NORMAL_INSTALL)
+2 -2
@@ -11,7 +11,7 @@
Copyright (c) 2000-2005 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2001-2002 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2002-2016 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016 Cristian Rodríguez <crrodriguez@opensuse.org>
Copyright (c) 2016 Thomas Beutlich <tc@tbeu.de>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
@@ -1082,7 +1082,7 @@ XML_SetReparseDeferralEnabled(XML_Parser parser, XML_Bool enabled);
*/
# define XML_MAJOR_VERSION 2
# define XML_MINOR_VERSION 7
# define XML_MICRO_VERSION 3
# define XML_MICRO_VERSION 5
# ifdef __cplusplus
}
+2 -3
@@ -12,7 +12,7 @@
Copyright (c) 2001-2002 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2002-2006 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2016 Cristian Rodríguez <crrodriguez@opensuse.org>
Copyright (c) 2016-2019 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2018 Yury Gribov <tetra2005@gmail.com>
Licensed under the MIT license:
@@ -88,8 +88,7 @@
# ifndef XML_BUILDING_EXPAT
/* using Expat from an application */
# if defined(_MSC_EXTENSIONS) && ! defined(__BEOS__) \
&& ! defined(__CYGWIN__)
# if defined(_MSC_VER) && ! defined(__BEOS__) && ! defined(__CYGWIN__)
# define XMLIMPORT __declspec(dllimport)
# endif
+1 -1
@@ -128,7 +128,7 @@
# elif ULONG_MAX == 18446744073709551615u // 2^64-1
# define EXPAT_FMT_PTRDIFF_T(midpart) "%" midpart "ld"
# define EXPAT_FMT_SIZE_T(midpart) "%" midpart "lu"
# elif defined(EMSCRIPTEN) // 32bit mode Emscripten
# elif defined(__wasm32__) // 32bit mode Emscripten or WASI SDK
# define EXPAT_FMT_PTRDIFF_T(midpart) "%" midpart "ld"
# define EXPAT_FMT_SIZE_T(midpart) "%" midpart "zu"
# else
+119
@@ -0,0 +1,119 @@
LIBEXPAT_1.0.0 {
global:
XML_DefaultCurrent;
XML_ErrorString;
XML_ExternalEntityParserCreate;
XML_GetBase;
XML_GetBuffer;
XML_GetCurrentByteIndex;
XML_GetCurrentColumnNumber;
XML_GetCurrentLineNumber;
XML_GetErrorCode;
XML_Parse;
XML_ParseBuffer;
XML_ParserCreate;
XML_ParserFree;
XML_SetBase;
XML_SetCharacterDataHandler;
XML_SetDefaultHandler;
XML_SetElementHandler;
XML_SetExternalEntityRefHandler;
XML_SetNotationDeclHandler;
XML_SetProcessingInstructionHandler;
XML_SetUnknownEncodingHandler;
XML_SetUnparsedEntityDeclHandler;
XML_SetUserData;
XML_UseParserAsHandlerArg;
};
LIBEXPAT_1.1.0 {
global:
XML_GetCurrentByteCount;
XML_GetSpecifiedAttributeCount;
XML_ParserCreateNS;
XML_SetCdataSectionHandler;
XML_SetCommentHandler;
XML_SetDefaultHandlerExpand;
XML_SetEncoding;
XML_SetExternalEntityRefHandlerArg;
XML_SetNamespaceDeclHandler;
XML_SetNotStandaloneHandler;
} LIBEXPAT_1.0.0;
LIBEXPAT_1.95.0 {
global:
XML_ExpatVersion;
XML_GetIdAttributeIndex;
XML_GetInputContext;
XML_ParserCreate_MM;
XML_SetAttlistDeclHandler;
XML_SetDoctypeDeclHandler;
XML_SetElementDeclHandler;
XML_SetEndCdataSectionHandler;
XML_SetEndDoctypeDeclHandler;
XML_SetEndElementHandler;
XML_SetEndNamespaceDeclHandler;
XML_SetEntityDeclHandler;
XML_SetParamEntityParsing;
XML_SetReturnNSTriplet;
XML_SetStartCdataSectionHandler;
XML_SetStartDoctypeDeclHandler;
XML_SetStartElementHandler;
XML_SetStartNamespaceDeclHandler;
XML_SetXmlDeclHandler;
} LIBEXPAT_1.1.0;
LIBEXPAT_1.95.3 {
global:
XML_ExpatVersionInfo;
XML_ParserReset;
} LIBEXPAT_1.95.0;
LIBEXPAT_1.95.4 {
global:
XML_SetSkippedEntityHandler;
} LIBEXPAT_1.95.3;
LIBEXPAT_1.95.5 {
global:
XML_GetFeatureList;
XML_UseForeignDTD;
} LIBEXPAT_1.95.4;
LIBEXPAT_1.95.6 {
global:
XML_FreeContentModel;
XML_MemFree;
XML_MemMalloc;
XML_MemRealloc;
} LIBEXPAT_1.95.5;
LIBEXPAT_1.95.8 {
global:
XML_GetParsingStatus;
XML_ResumeParser;
XML_StopParser;
} LIBEXPAT_1.95.6;
LIBEXPAT_2.1.0 {
global:
@_EXPAT_COMMENT_ATTR_INFO@ XML_GetAttributeInfo;
XML_SetHashSalt;
} LIBEXPAT_1.95.8;
LIBEXPAT_2.4.0 {
global:
@_EXPAT_COMMENT_DTD_OR_GE@ XML_SetBillionLaughsAttackProtectionActivationThreshold;
@_EXPAT_COMMENT_DTD_OR_GE@ XML_SetBillionLaughsAttackProtectionMaximumAmplification;
} LIBEXPAT_2.1.0;
LIBEXPAT_2.6.0 {
global:
XML_SetReparseDeferralEnabled;
} LIBEXPAT_2.4.0;
LIBEXPAT_2.7.2 {
global:
@_EXPAT_COMMENT_DTD_OR_GE@ XML_SetAllocTrackerActivationThreshold;
@_EXPAT_COMMENT_DTD_OR_GE@ XML_SetAllocTrackerMaximumAmplification;
} LIBEXPAT_2.6.0;
+112 -61
@@ -1,4 +1,4 @@
/* 28bcd8b1ba7eb595d82822908257fd9c3589b4243e3c922d0369f35bfcd7b506 (2.7.3+)
/* 93c1caa66e2b0310459482516af05505b57c5cb7b96df777105308fc585c85d1 (2.7.5+)
__ __ _
___\ \/ /_ __ __ _| |_
/ _ \\ /| '_ \ / _` | __|
@@ -13,7 +13,7 @@
Copyright (c) 2002-2016 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2005-2009 Steven Solie <steven@solie.ca>
Copyright (c) 2016 Eric Rahm <erahm@mozilla.com>
Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016 Gaurav <g.gupta@samsung.com>
Copyright (c) 2016 Thomas Beutlich <tc@tbeu.de>
Copyright (c) 2016 Gustavo Grieco <gustavo.grieco@imag.fr>
@@ -42,6 +42,9 @@
Copyright (c) 2024-2025 Berkay Eren Ürün <berkay.ueruen@siemens.com>
Copyright (c) 2024 Hanno Böck <hanno@gentoo.org>
Copyright (c) 2025 Matthew Fernandez <matthew.fernandez@gmail.com>
Copyright (c) 2025 Atrem Borovik <polzovatellllk@gmail.com>
Copyright (c) 2025 Alfonso Gregory <gfunni234@gmail.com>
Copyright (c) 2026 Rosen Penev <rosenp@gmail.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -101,7 +104,7 @@
#include <limits.h> /* INT_MAX, UINT_MAX */
#include <stdio.h> /* fprintf */
#include <stdlib.h> /* getenv, rand_s */
#include <stdint.h> /* uintptr_t */
#include <stdint.h> /* SIZE_MAX, uintptr_t */
#include <math.h> /* isnan */
#ifdef _WIN32
@@ -134,11 +137,6 @@
# endif /* defined(GRND_NONBLOCK) */
#endif /* defined(HAVE_GETRANDOM) || defined(HAVE_SYSCALL_GETRANDOM) */
#if defined(HAVE_LIBBSD) \
&& (defined(HAVE_ARC4RANDOM_BUF) || defined(HAVE_ARC4RANDOM))
# include <bsd/stdlib.h>
#endif
#if defined(_WIN32) && ! defined(LOAD_LIBRARY_SEARCH_SYSTEM32)
# define LOAD_LIBRARY_SEARCH_SYSTEM32 0x00000800
#endif
@@ -155,8 +153,6 @@
* Linux >=3.17 + glibc (including <2.25) (syscall SYS_getrandom): HAVE_SYSCALL_GETRANDOM, \
* BSD / macOS >=10.7 / glibc >=2.36 (arc4random_buf): HAVE_ARC4RANDOM_BUF, \
* BSD / macOS (including <10.7) / glibc >=2.36 (arc4random): HAVE_ARC4RANDOM, \
* libbsd (arc4random_buf): HAVE_ARC4RANDOM_BUF + HAVE_LIBBSD, \
* libbsd (arc4random): HAVE_ARC4RANDOM + HAVE_LIBBSD, \
* Linux (including <3.17) / BSD / macOS (including <10.7) / Solaris >=8 (/dev/urandom): XML_DEV_URANDOM, \
* Windows >=Vista (rand_s): _WIN32. \
\
@@ -311,8 +307,11 @@ typedef struct tag {
const char *rawName; /* tagName in the original encoding */
int rawNameLength;
TAG_NAME name; /* tagName in the API encoding */
char *buf; /* buffer for name components */
char *bufEnd; /* end of the buffer */
union {
char *raw; /* for byte-level access (rawName storage) */
XML_Char *str; /* for character-level access (converted name) */
} buf; /* buffer for name components */
char *bufEnd; /* end of the buffer */
BINDING *bindings;
} TAG;
@@ -349,7 +348,7 @@ typedef struct {
typedef struct block {
struct block *next;
int size;
XML_Char s[1];
XML_Char s[];
} BLOCK;
typedef struct {
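The hunk above changes `BLOCK.s` from a one-element array to a C99 flexible array member. A minimal sketch of how such a struct is typically allocated follows. The names `BLOCK_SKETCH` and `block_alloc` are hypothetical and not part of Expat, and `char` stands in for `XML_Char`:

```c
#include <stddef.h> /* offsetof, size_t */
#include <stdlib.h> /* malloc */

/* Hypothetical stand-in for Expat's BLOCK; uses char instead of XML_Char. */
typedef struct block_sketch {
  struct block_sketch *next;
  int size;
  char s[]; /* flexible array member: contributes no storage by itself */
} BLOCK_SKETCH;

/* Allocates the header plus `size` payload bytes in one chunk.
   Returns NULL for a negative size or on allocation failure. */
BLOCK_SKETCH *
block_alloc(int size) {
  if (size < 0)
    return NULL;
  BLOCK_SKETCH *b = malloc(offsetof(BLOCK_SKETCH, s) + (size_t)size);
  if (b == NULL)
    return NULL;
  b->next = NULL;
  b->size = size;
  return b;
}
```

With `s[]` the payload size is computed from `offsetof(..., s)`, so no element is counted twice, which is easy to get wrong with the older `s[1]` idiom.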
@@ -591,6 +590,8 @@ static XML_Char *poolStoreString(STRING_POOL *pool, const ENCODING *enc,
static XML_Bool FASTCALL poolGrow(STRING_POOL *pool);
static const XML_Char *FASTCALL poolCopyString(STRING_POOL *pool,
const XML_Char *s);
static const XML_Char *FASTCALL poolCopyStringNoFinish(STRING_POOL *pool,
const XML_Char *s);
static const XML_Char *poolCopyStringN(STRING_POOL *pool, const XML_Char *s,
int n);
static const XML_Char *FASTCALL poolAppendString(STRING_POOL *pool,
@@ -1230,8 +1231,11 @@ generate_hash_secret_salt(XML_Parser parser) {
# endif /* ! defined(_WIN32) && defined(XML_DEV_URANDOM) */
/* .. and self-made low quality for backup: */
entropy = gather_time_entropy();
# if ! defined(__wasi__)
/* Process ID is 0 bits entropy if attacker has local access */
entropy = gather_time_entropy() ^ getpid();
entropy ^= getpid();
# endif
/* Factors are 2^31-1 and 2^61-1 (Mersenne primes M31 and M61) */
if (sizeof(unsigned long) == 4) {
@@ -1754,6 +1758,7 @@ XML_ExternalEntityParserCreate(XML_Parser oldParser, const XML_Char *context,
XML_ExternalEntityRefHandler oldExternalEntityRefHandler;
XML_SkippedEntityHandler oldSkippedEntityHandler;
XML_UnknownEncodingHandler oldUnknownEncodingHandler;
void *oldUnknownEncodingHandlerData;
XML_ElementDeclHandler oldElementDeclHandler;
XML_AttlistDeclHandler oldAttlistDeclHandler;
XML_EntityDeclHandler oldEntityDeclHandler;
@@ -1799,6 +1804,7 @@ XML_ExternalEntityParserCreate(XML_Parser oldParser, const XML_Char *context,
oldExternalEntityRefHandler = parser->m_externalEntityRefHandler;
oldSkippedEntityHandler = parser->m_skippedEntityHandler;
oldUnknownEncodingHandler = parser->m_unknownEncodingHandler;
oldUnknownEncodingHandlerData = parser->m_unknownEncodingHandlerData;
oldElementDeclHandler = parser->m_elementDeclHandler;
oldAttlistDeclHandler = parser->m_attlistDeclHandler;
oldEntityDeclHandler = parser->m_entityDeclHandler;
@@ -1859,6 +1865,7 @@ XML_ExternalEntityParserCreate(XML_Parser oldParser, const XML_Char *context,
parser->m_externalEntityRefHandler = oldExternalEntityRefHandler;
parser->m_skippedEntityHandler = oldSkippedEntityHandler;
parser->m_unknownEncodingHandler = oldUnknownEncodingHandler;
parser->m_unknownEncodingHandlerData = oldUnknownEncodingHandlerData;
parser->m_elementDeclHandler = oldElementDeclHandler;
parser->m_attlistDeclHandler = oldAttlistDeclHandler;
parser->m_entityDeclHandler = oldEntityDeclHandler;
@@ -1934,7 +1941,7 @@ XML_ParserFree(XML_Parser parser) {
}
p = tagList;
tagList = tagList->parent;
FREE(parser, p->buf);
FREE(parser, p->buf.raw);
destroyBindings(p->bindings, parser);
FREE(parser, p);
}
@@ -2599,7 +2606,7 @@ XML_GetBuffer(XML_Parser parser, int len) {
// NOTE: We are avoiding MALLOC(..) here to leave limiting
// the input size to the application using Expat.
newBuf = parser->m_mem.malloc_fcn(bufferSize);
if (newBuf == 0) {
if (newBuf == NULL) {
parser->m_errorCode = XML_ERROR_NO_MEMORY;
return NULL;
}
@@ -3126,7 +3133,7 @@ storeRawNames(XML_Parser parser) {
size_t bufSize;
size_t nameLen = sizeof(XML_Char) * (tag->name.strLen + 1);
size_t rawNameLen;
char *rawNameBuf = tag->buf + nameLen;
char *rawNameBuf = tag->buf.raw + nameLen;
/* Stop if already stored. Since m_tagStack is a stack, we can stop
at the first entry that has already been copied; everything
below it in the stack is already been accounted for in a
@@ -3142,22 +3149,22 @@ storeRawNames(XML_Parser parser) {
if (rawNameLen > (size_t)INT_MAX - nameLen)
return XML_FALSE;
bufSize = nameLen + rawNameLen;
if (bufSize > (size_t)(tag->bufEnd - tag->buf)) {
char *temp = REALLOC(parser, tag->buf, bufSize);
if (bufSize > (size_t)(tag->bufEnd - tag->buf.raw)) {
char *temp = REALLOC(parser, tag->buf.raw, bufSize);
if (temp == NULL)
return XML_FALSE;
/* if tag->name.str points to tag->buf (only when namespace
/* if tag->name.str points to tag->buf.str (only when namespace
processing is off) then we have to update it
*/
if (tag->name.str == (XML_Char *)tag->buf)
if (tag->name.str == tag->buf.str)
tag->name.str = (XML_Char *)temp;
/* if tag->name.localPart is set (when namespace processing is on)
then update it as well, since it will always point into tag->buf
*/
if (tag->name.localPart)
tag->name.localPart
= (XML_Char *)temp + (tag->name.localPart - (XML_Char *)tag->buf);
tag->buf = temp;
= (XML_Char *)temp + (tag->name.localPart - tag->buf.str);
tag->buf.raw = temp;
tag->bufEnd = temp + bufSize;
rawNameBuf = temp + nameLen;
}
@@ -3472,12 +3479,12 @@ doContent(XML_Parser parser, int startTagLevel, const ENCODING *enc,
tag = MALLOC(parser, sizeof(TAG));
if (! tag)
return XML_ERROR_NO_MEMORY;
tag->buf = MALLOC(parser, INIT_TAG_BUF_SIZE);
if (! tag->buf) {
tag->buf.raw = MALLOC(parser, INIT_TAG_BUF_SIZE);
if (! tag->buf.raw) {
FREE(parser, tag);
return XML_ERROR_NO_MEMORY;
}
tag->bufEnd = tag->buf + INIT_TAG_BUF_SIZE;
tag->bufEnd = tag->buf.raw + INIT_TAG_BUF_SIZE;
}
tag->bindings = NULL;
tag->parent = parser->m_tagStack;
@@ -3490,31 +3497,32 @@ doContent(XML_Parser parser, int startTagLevel, const ENCODING *enc,
{
const char *rawNameEnd = tag->rawName + tag->rawNameLength;
const char *fromPtr = tag->rawName;
toPtr = (XML_Char *)tag->buf;
toPtr = tag->buf.str;
for (;;) {
int bufSize;
int convLen;
const enum XML_Convert_Result convert_res
= XmlConvert(enc, &fromPtr, rawNameEnd, (ICHAR **)&toPtr,
(ICHAR *)tag->bufEnd - 1);
convLen = (int)(toPtr - (XML_Char *)tag->buf);
convLen = (int)(toPtr - tag->buf.str);
if ((fromPtr >= rawNameEnd)
|| (convert_res == XML_CONVERT_INPUT_INCOMPLETE)) {
tag->name.strLen = convLen;
break;
}
bufSize = (int)(tag->bufEnd - tag->buf) << 1;
if (SIZE_MAX / 2 < (size_t)(tag->bufEnd - tag->buf.raw))
return XML_ERROR_NO_MEMORY;
const size_t bufSize = (size_t)(tag->bufEnd - tag->buf.raw) * 2;
{
char *temp = REALLOC(parser, tag->buf, bufSize);
char *temp = REALLOC(parser, tag->buf.raw, bufSize);
if (temp == NULL)
return XML_ERROR_NO_MEMORY;
tag->buf = temp;
tag->buf.raw = temp;
tag->bufEnd = temp + bufSize;
toPtr = (XML_Char *)temp + convLen;
}
}
}
tag->name.str = (XML_Char *)tag->buf;
tag->name.str = tag->buf.str;
*toPtr = XML_T('\0');
result
= storeAtts(parser, enc, s, &(tag->name), &(tag->bindings), account);
@@ -3878,7 +3886,7 @@ storeAtts(XML_Parser parser, const ENCODING *enc, const char *attStr,
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if ((unsigned)parser->m_attsSize > (size_t)(-1) / sizeof(ATTRIBUTE)) {
if ((unsigned)parser->m_attsSize > SIZE_MAX / sizeof(ATTRIBUTE)) {
parser->m_attsSize = oldAttsSize;
return XML_ERROR_NO_MEMORY;
}
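The guard in this hunk compares the element count against `SIZE_MAX / sizeof(T)` before multiplying, which is equivalent to the older `(size_t)(-1)` spelling. A minimal standalone sketch of that idiom follows. The helper `alloc_array_checked` is hypothetical and does not exist in Expat:

```c
#include <stddef.h> /* size_t */
#include <stdint.h> /* SIZE_MAX */
#include <stdlib.h> /* malloc */

/* Hypothetical helper: allocates room for `count` elements of
   `elem_size` bytes each, refusing requests whose byte size would not
   fit in size_t. Returns NULL on overflow or allocation failure. */
void *
alloc_array_checked(size_t count, size_t elem_size) {
  /* count * elem_size overflows exactly when count > SIZE_MAX / elem_size */
  if (elem_size != 0 && count > SIZE_MAX / elem_size)
    return NULL;
  return malloc(count * elem_size);
}
```

Dividing the limit instead of multiplying the operands keeps the check itself free of overflow.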
@@ -3897,7 +3905,7 @@ storeAtts(XML_Parser parser, const ENCODING *enc, const char *attStr,
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
# if UINT_MAX >= SIZE_MAX
if ((unsigned)parser->m_attsSize > (size_t)(-1) / sizeof(XML_AttrInfo)) {
if ((unsigned)parser->m_attsSize > SIZE_MAX / sizeof(XML_AttrInfo)) {
parser->m_attsSize = oldAttsSize;
return XML_ERROR_NO_MEMORY;
}
@@ -4073,7 +4081,7 @@ storeAtts(XML_Parser parser, const ENCODING *enc, const char *attStr,
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if (nsAttsSize > (size_t)(-1) / sizeof(NS_ATT)) {
if (nsAttsSize > SIZE_MAX / sizeof(NS_ATT)) {
/* Restore actual size of memory in m_nsAtts */
parser->m_nsAttsPower = oldNsAttsPower;
return XML_ERROR_NO_MEMORY;
@@ -4256,7 +4264,7 @@ storeAtts(XML_Parser parser, const ENCODING *enc, const char *attStr,
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if ((unsigned)(n + EXPAND_SPARE) > (size_t)(-1) / sizeof(XML_Char)) {
if ((unsigned)(n + EXPAND_SPARE) > SIZE_MAX / sizeof(XML_Char)) {
return XML_ERROR_NO_MEMORY;
}
#endif
@@ -4502,7 +4510,7 @@ addBinding(XML_Parser parser, PREFIX *prefix, const ATTRIBUTE_ID *attId,
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if ((unsigned)(len + EXPAND_SPARE) > (size_t)(-1) / sizeof(XML_Char)) {
if ((unsigned)(len + EXPAND_SPARE) > SIZE_MAX / sizeof(XML_Char)) {
return XML_ERROR_NO_MEMORY;
}
#endif
@@ -4529,7 +4537,7 @@ addBinding(XML_Parser parser, PREFIX *prefix, const ATTRIBUTE_ID *attId,
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if ((unsigned)(len + EXPAND_SPARE) > (size_t)(-1) / sizeof(XML_Char)) {
if ((unsigned)(len + EXPAND_SPARE) > SIZE_MAX / sizeof(XML_Char)) {
return XML_ERROR_NO_MEMORY;
}
#endif
@@ -5080,7 +5088,7 @@ entityValueInitProcessor(XML_Parser parser, const char *s, const char *end,
}
/* If we get this token, we have the start of what might be a
normal tag, but not a declaration (i.e. it doesn't begin with
"<!"). In a DTD context, that isn't legal.
"<!" or "<?"). In a DTD context, that isn't legal.
*/
else if (tok == XML_TOK_INSTANCE_START) {
*nextPtr = next;
@@ -5169,6 +5177,15 @@ entityValueProcessor(XML_Parser parser, const char *s, const char *end,
/* found end of entity value - can store it now */
return storeEntityValue(parser, enc, s, end, XML_ACCOUNT_DIRECT, NULL);
}
/* If we get this token, we have the start of what might be a
normal tag, but not a declaration (i.e. it doesn't begin with
"<!" or "<?"). In a DTD context, that isn't legal.
*/
else if (tok == XML_TOK_INSTANCE_START) {
*nextPtr = next;
return XML_ERROR_SYNTAX;
}
start = next;
}
}
@@ -5920,15 +5937,18 @@ doProlog(XML_Parser parser, const ENCODING *enc, const char *s, const char *end,
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if (parser->m_groupSize > (size_t)(-1) / sizeof(int)) {
if (parser->m_groupSize > SIZE_MAX / sizeof(int)) {
parser->m_groupSize /= 2;
return XML_ERROR_NO_MEMORY;
}
#endif
int *const new_scaff_index = REALLOC(
parser, dtd->scaffIndex, parser->m_groupSize * sizeof(int));
if (new_scaff_index == NULL)
if (new_scaff_index == NULL) {
parser->m_groupSize /= 2;
return XML_ERROR_NO_MEMORY;
}
dtd->scaffIndex = new_scaff_index;
}
} else {
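This hunk restores `m_groupSize` when `REALLOC` fails, so the recorded size does not run ahead of the real allocation. A rough sketch of the same rollback idea follows. The `Group` struct, `group_grow`, and the injectable allocator are invented for illustration so that the failure path can be exercised:

```c
#include <limits.h> /* UINT_MAX */
#include <stddef.h> /* size_t */
#include <stdint.h> /* SIZE_MAX */
#include <stdlib.h> /* realloc */

typedef void *(*realloc_fn)(void *ptr, size_t size);

/* Hypothetical container; not an Expat type. */
typedef struct {
  int *items;
  unsigned int size; /* number of ints that `items` can hold */
} Group;

/* Always fails; lets callers exercise the rollback path. */
void *
failing_realloc(void *ptr, size_t size) {
  (void)ptr;
  (void)size;
  return NULL;
}

/* Doubles the capacity. Returns 0 on success and -1 on failure; on
   failure both `items` and `size` are left as they were. */
int
group_grow(Group *g, realloc_fn re) {
  if (g->size == 0 || g->size > UINT_MAX / 2)
    return -1;
  g->size *= 2; /* optimistic update, mirroring the hunk above */
#if UINT_MAX >= SIZE_MAX
  if (g->size > SIZE_MAX / sizeof(int)) {
    g->size /= 2; /* roll back before reporting the error */
    return -1;
  }
#endif
  int *const grown = re(g->items, g->size * sizeof(int));
  if (grown == NULL) {
    g->size /= 2; /* roll back: the old buffer is still the valid one */
    return -1;
  }
  g->items = grown;
  return 0;
}
```

An alternative design computes the new size in a local variable and commits it only after the reallocation succeeds, which avoids the need for a rollback on each error path.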
@@ -6780,7 +6800,14 @@ storeEntityValue(XML_Parser parser, const ENCODING *enc,
return XML_ERROR_NO_MEMORY;
}
const char *next;
const char *next = entityTextPtr;
/* Nothing to tokenize. */
if (entityTextPtr >= entityTextEnd) {
result = XML_ERROR_NONE;
goto endEntityValue;
}
for (;;) {
next
= entityTextPtr; /* XmlEntityValueTok doesn't always set the last arg */
@@ -7190,7 +7217,7 @@ defineAttribute(ELEMENT_TYPE *type, ATTRIBUTE_ID *attId, XML_Bool isCdata,
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if ((unsigned)count > (size_t)(-1) / sizeof(DEFAULT_ATTRIBUTE)) {
if ((unsigned)count > SIZE_MAX / sizeof(DEFAULT_ATTRIBUTE)) {
return 0;
}
#endif
@@ -7430,16 +7457,24 @@ setContext(XML_Parser parser, const XML_Char *context) {
else {
if (! poolAppendChar(&parser->m_tempPool, XML_T('\0')))
return XML_FALSE;
prefix
= (PREFIX *)lookup(parser, &dtd->prefixes,
poolStart(&parser->m_tempPool), sizeof(PREFIX));
const XML_Char *const prefixName = poolCopyStringNoFinish(
&dtd->pool, poolStart(&parser->m_tempPool));
if (! prefixName) {
return XML_FALSE;
}
prefix = (PREFIX *)lookup(parser, &dtd->prefixes, prefixName,
sizeof(PREFIX));
const bool prefixNameUsed = prefix && prefix->name == prefixName;
if (prefixNameUsed)
poolFinish(&dtd->pool);
else
poolDiscard(&dtd->pool);
if (! prefix)
return XML_FALSE;
if (prefix->name == poolStart(&parser->m_tempPool)) {
prefix->name = poolCopyString(&dtd->pool, prefix->name);
if (! prefix->name)
return XML_FALSE;
}
poolDiscard(&parser->m_tempPool);
}
for (context = s + 1; *context != CONTEXT_SEP && *context != XML_T('\0');
@@ -7666,8 +7701,7 @@ dtdCopy(XML_Parser oldParser, DTD *newDtd, const DTD *oldDtd,
* from -Wtype-limits on platforms where
* sizeof(int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if ((size_t)oldE->nDefaultAtts
> ((size_t)(-1) / sizeof(DEFAULT_ATTRIBUTE))) {
if ((size_t)oldE->nDefaultAtts > SIZE_MAX / sizeof(DEFAULT_ATTRIBUTE)) {
return 0;
}
#endif
@@ -7869,7 +7903,7 @@ lookup(XML_Parser parser, HASH_TABLE *table, KEY name, size_t createSize) {
unsigned long newMask = (unsigned long)newSize - 1;
/* Detect and prevent integer overflow */
if (newSize > (size_t)(-1) / sizeof(NAMED *)) {
if (newSize > SIZE_MAX / sizeof(NAMED *)) {
return NULL;
}
@@ -8028,6 +8062,23 @@ poolCopyString(STRING_POOL *pool, const XML_Char *s) {
return s;
}
// A version of `poolCopyString` that does not call `poolFinish`
// and reverts any partial advancement upon failure.
static const XML_Char *FASTCALL
poolCopyStringNoFinish(STRING_POOL *pool, const XML_Char *s) {
const XML_Char *const original = s;
do {
if (! poolAppendChar(pool, *s)) {
// Revert any previously successful advancement
const ptrdiff_t advancedBy = s - original;
if (advancedBy > 0)
pool->ptr -= advancedBy;
return NULL;
}
} while (*s++);
return pool->start;
}
static const XML_Char *
poolCopyStringN(STRING_POOL *pool, const XML_Char *s, int n) {
if (! pool->ptr && ! poolGrow(pool)) {
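Note on the hunk above: `poolCopyStringNoFinish` appends a string to a pool without finishing it, and undoes any partial append on failure, so the caller can still choose between `poolFinish` and `poolDiscard`. A minimal standalone sketch of that rollback idea follows. `TinyPool`, `tinyAppendChar` and `tinyCopyStringNoFinish` are hypothetical names for illustration and are not part of Expat. The sketch restores a saved offset, whereas the Expat code subtracts the number of characters already appended from `pool->ptr`.

```c
#include <stddef.h>

/* Hypothetical fixed-capacity stand-in for Expat's STRING_POOL. */
typedef struct {
  char buf[8];
  size_t used;
} TinyPool;

/* Appends one character; returns 0 when the pool is full. */
static int
tinyAppendChar(TinyPool *pool, char c) {
  if (pool->used >= sizeof(pool->buf))
    return 0;
  pool->buf[pool->used++] = c;
  return 1;
}

/* Appends s including its terminating NUL. On failure, restores
 * pool->used to its value at entry, so no partial string is left. */
static const char *
tinyCopyStringNoFinish(TinyPool *pool, const char *s) {
  const size_t before = pool->used;
  do {
    if (! tinyAppendChar(pool, *s)) {
      pool->used = before; /* revert partial advancement */
      return NULL;
    }
  } while (*s++);
  return pool->buf + before;
}
```

With this shape, a failed copy leaves the pool exactly as it was found, which is the property that a retry after an out-of-memory condition needs.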
@@ -8105,7 +8156,7 @@ poolBytesToAllocateFor(int blockSize) {
static XML_Bool FASTCALL
poolGrow(STRING_POOL *pool) {
if (pool->freeBlocks) {
if (pool->start == 0) {
if (pool->start == NULL) {
pool->blocks = pool->freeBlocks;
pool->freeBlocks = pool->freeBlocks->next;
pool->blocks->next = NULL;
@@ -8217,7 +8268,7 @@ nextScaffoldPart(XML_Parser parser) {
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if (parser->m_groupSize > ((size_t)(-1) / sizeof(int))) {
if (parser->m_groupSize > SIZE_MAX / sizeof(int)) {
return -1;
}
#endif
@@ -8244,7 +8295,7 @@ nextScaffoldPart(XML_Parser parser) {
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if (dtd->scaffSize > (size_t)(-1) / 2u / sizeof(CONTENT_SCAFFOLD)) {
if (dtd->scaffSize > SIZE_MAX / 2u / sizeof(CONTENT_SCAFFOLD)) {
return -1;
}
#endif
@@ -8294,15 +8345,15 @@ build_model(XML_Parser parser) {
* from -Wtype-limits on platforms where
* sizeof(unsigned int) < sizeof(size_t), e.g. on x86_64. */
#if UINT_MAX >= SIZE_MAX
if (dtd->scaffCount > (size_t)(-1) / sizeof(XML_Content)) {
if (dtd->scaffCount > SIZE_MAX / sizeof(XML_Content)) {
return NULL;
}
if (dtd->contentStringLen > (size_t)(-1) / sizeof(XML_Char)) {
if (dtd->contentStringLen > SIZE_MAX / sizeof(XML_Char)) {
return NULL;
}
#endif
if (dtd->scaffCount * sizeof(XML_Content)
> (size_t)(-1) - dtd->contentStringLen * sizeof(XML_Char)) {
> SIZE_MAX - dtd->contentStringLen * sizeof(XML_Char)) {
return NULL;
}
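Note on the hunks above: they replace `(size_t)(-1)` with `SIZE_MAX` in the guards that run before a `count * sizeof(T)` multiplication. A minimal standalone sketch of that guard pattern follows, assuming a hypothetical helper `checked_array_bytes` that is not part of Expat.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Expat): stores count * elemSize in
 * *out, or returns false when the product would exceed SIZE_MAX. */
static bool
checked_array_bytes(size_t count, size_t elemSize, size_t *out) {
  if (elemSize != 0 && count > SIZE_MAX / elemSize) {
    return false; /* count * elemSize would wrap around */
  }
  *out = count * elemSize;
  return true;
}
```

Dividing the limit by the element size keeps the check itself free of overflow, which a check written as `count * elemSize > SIZE_MAX` would not be.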
+2 -2
@@ -12,10 +12,11 @@
Copyright (c) 2002-2006 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2002-2003 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2005-2009 Steven Solie <steven@solie.ca>
Copyright (c) 2016-2023 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2019 David Loffredo <loffredo@steptools.com>
Copyright (c) 2021 Donghee Na <donghee.na@python.org>
Copyright (c) 2025 Alfonso Gregory <gfunni234@gmail.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -46,7 +47,6 @@
# include "winconfig.h"
#endif
#include "expat_external.h"
#include "internal.h"
#include "xmlrole.h"
#include "ascii.h"
+2 -2
@@ -12,7 +12,7 @@
Copyright (c) 2002 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2002-2016 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2005-2009 Steven Solie <steven@solie.ca>
Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016 Pascal Cuoq <cuoq@trust-in-soft.com>
Copyright (c) 2016 Don Lewis <truckman@apache.org>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
@@ -24,6 +24,7 @@
Copyright (c) 2022 Martin Ettl <ettl.martin78@googlemail.com>
Copyright (c) 2022 Sean McBride <sean@rogue-research.com>
Copyright (c) 2023 Hanno Böck <hanno@gentoo.org>
Copyright (c) 2025 Alfonso Gregory <gfunni234@gmail.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -56,7 +57,6 @@
# include "winconfig.h"
#endif
#include "expat_external.h"
#include "internal.h"
#include "xmltok.h"
#include "nametab.h"
+4 -3
@@ -11,7 +11,8 @@
Copyright (c) 2002 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2002 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2002-2006 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2017-2021 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2025 Alfonso Gregory <gfunni234@gmail.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -98,13 +99,13 @@ NS(findEncoding)(const ENCODING *enc, const char *ptr, const char *end) {
int i;
XmlUtf8Convert(enc, &ptr, end, &p, p + ENCODING_MAX - 1);
if (ptr != end)
return 0;
return NULL;
*p = 0;
if (streqci(buf, KW_UTF_16) && enc->minBytesPerChar == 2)
return enc;
i = getEncodingIndex(buf);
if (i == UNKNOWN_ENC)
return 0;
return NULL;
return NS(encodings)[i];
}
+3
@@ -616,6 +616,9 @@ SO_MINOR = @SO_MINOR@
SO_PATCH = @SO_PATCH@
STRIP = @STRIP@
VERSION = @VERSION@
VSCRIPT_LDFLAGS = @VSCRIPT_LDFLAGS@
_EXPAT_COMMENT_ATTR_INFO = @_EXPAT_COMMENT_ATTR_INFO@
_EXPAT_COMMENT_DTD_OR_GE = @_EXPAT_COMMENT_DTD_OR_GE@
abs_builddir = @abs_builddir@
abs_srcdir = @abs_srcdir@
abs_top_builddir = @abs_top_builddir@
+70 -4
@@ -10,7 +10,7 @@
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2005-2012 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Joe Orton <jorton@redhat.com>
Copyright (c) 2017 José Gutiérrez de la Concha <jose@zeroc.com>
@@ -3112,12 +3112,16 @@ START_TEST(test_buffer_can_grow_to_max) {
#if defined(__MINGW32__) && ! defined(__MINGW64__)
// workaround for mingw/wine32 on GitHub CI not being able to reach 1GiB
// Can we make a big allocation?
void *big = malloc(maxbuf);
if (! big) {
for (int i = 1; i <= 2; i++) {
void *const big = malloc(maxbuf);
if (big != NULL) {
free(big);
break;
}
// The big allocation failed. Let's be a little lenient.
maxbuf = maxbuf / 2;
fprintf(stderr, "Reducing maxbuf to %d...\n", maxbuf);
}
free(big);
#endif
for (int i = 0; i < num_prefixes; ++i) {
@@ -4570,6 +4574,46 @@ START_TEST(test_unknown_encoding_invalid_attr_value) {
}
END_TEST
START_TEST(test_unknown_encoding_user_data_primary) {
// This test is based on ideas contributed by Artiphishell Inc.
const char *const text = "<?xml version='1.0' encoding='x-unk'?>\n"
"<root />\n";
XML_Parser parser = XML_ParserCreate(NULL);
XML_SetUnknownEncodingHandler(parser,
user_data_checking_unknown_encoding_handler,
(void *)(intptr_t)0xC0FFEE);
assert_true(_XML_Parse_SINGLE_BYTES(parser, text, (int)strlen(text), XML_TRUE)
== XML_STATUS_OK);
XML_ParserFree(parser);
}
END_TEST
START_TEST(test_unknown_encoding_user_data_secondary) {
// This test is based on ideas contributed by Artiphishell Inc.
const char *const text_main = "<!DOCTYPE r [\n"
" <!ENTITY ext SYSTEM 'ext.ent'>\n"
"]>\n"
"<r>&ext;</r>\n";
const char *const text_external = "<?xml version='1.0' encoding='x-unk'?>\n"
"<e>data</e>";
ExtTest2 test_data = {text_external, (int)strlen(text_external), NULL, NULL};
XML_Parser parser = XML_ParserCreate(NULL);
XML_SetExternalEntityRefHandler(parser, external_entity_loader2);
XML_SetUnknownEncodingHandler(parser,
user_data_checking_unknown_encoding_handler,
(void *)(intptr_t)0xC0FFEE);
XML_SetUserData(parser, &test_data);
assert_true(_XML_Parse_SINGLE_BYTES(parser, text_main, (int)strlen(text_main),
XML_TRUE)
== XML_STATUS_OK);
XML_ParserFree(parser);
}
END_TEST
/* Test an external entity parser set to use latin-1 detects UTF-16
* BOMs correctly.
*/
@@ -6001,6 +6045,7 @@ START_TEST(test_bypass_heuristic_when_close_to_bufsize) {
const int document_length = 65536;
char *const document = (char *)malloc(document_length);
assert_true(document != NULL);
const XML_Memory_Handling_Suite memfuncs = {
counting_malloc,
@@ -6213,6 +6258,24 @@ START_TEST(test_varying_buffer_fills) {
}
END_TEST
START_TEST(test_empty_ext_param_entity_in_value) {
const char *text = "<!DOCTYPE r SYSTEM \"ext.dtd\"><r/>";
ExtOption options[] = {
{XCS("ext.dtd"), "<!ENTITY % pe SYSTEM \"empty\">"
"<!ENTITY ge \"%pe;\">"},
{XCS("empty"), ""},
{NULL, NULL},
};
XML_SetParamEntityParsing(g_parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
XML_SetExternalEntityRefHandler(g_parser, external_entity_optioner);
XML_SetUserData(g_parser, options);
if (_XML_Parse_SINGLE_BYTES(g_parser, text, (int)strlen(text), XML_TRUE)
== XML_STATUS_ERROR)
xml_failure(g_parser);
}
END_TEST
void
make_basic_test_case(Suite *s) {
TCase *tc_basic = tcase_create("basic tests");
@@ -6416,6 +6479,8 @@ make_basic_test_case(Suite *s) {
tcase_add_test(tc_basic, test_unknown_encoding_invalid_surrogate);
tcase_add_test(tc_basic, test_unknown_encoding_invalid_high);
tcase_add_test(tc_basic, test_unknown_encoding_invalid_attr_value);
tcase_add_test(tc_basic, test_unknown_encoding_user_data_primary);
tcase_add_test(tc_basic, test_unknown_encoding_user_data_secondary);
tcase_add_test__if_xml_ge(tc_basic, test_ext_entity_latin1_utf16le_bom);
tcase_add_test__if_xml_ge(tc_basic, test_ext_entity_latin1_utf16be_bom);
tcase_add_test__if_xml_ge(tc_basic, test_ext_entity_latin1_utf16le_bom2);
@@ -6458,6 +6523,7 @@ make_basic_test_case(Suite *s) {
tcase_add_test(tc_basic, test_empty_element_abort);
tcase_add_test__ifdef_xml_dtd(tc_basic,
test_pool_integrity_with_unfinished_attr);
tcase_add_test__ifdef_xml_dtd(tc_basic, test_empty_ext_param_entity_in_value);
tcase_add_test__if_xml_ge(tc_basic, test_entity_ref_no_elements);
tcase_add_test__if_xml_ge(tc_basic, test_deep_nested_entity);
tcase_add_test__if_xml_ge(tc_basic, test_deep_nested_attribute_entity);
+3
@@ -311,6 +311,9 @@ SO_MINOR = @SO_MINOR@
SO_PATCH = @SO_PATCH@
STRIP = @STRIP@
VERSION = @VERSION@
VSCRIPT_LDFLAGS = @VSCRIPT_LDFLAGS@
_EXPAT_COMMENT_ATTR_INFO = @_EXPAT_COMMENT_ATTR_INFO@
_EXPAT_COMMENT_DTD_OR_GE = @_EXPAT_COMMENT_DTD_OR_GE@
abs_builddir = @abs_builddir@
abs_srcdir = @abs_srcdir@
abs_top_builddir = @abs_top_builddir@
+11 -1
@@ -10,7 +10,7 @@
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2005-2012 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Joe Orton <jorton@redhat.com>
Copyright (c) 2017 José Gutiérrez de la Concha <jose@zeroc.com>
@@ -45,6 +45,7 @@
# undef NDEBUG /* because test suite relies on assert(...) at the moment */
#endif
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
@@ -407,6 +408,15 @@ long_encoding_handler(void *userData, const XML_Char *encoding,
return XML_STATUS_OK;
}
int XMLCALL
user_data_checking_unknown_encoding_handler(void *userData,
const XML_Char *encoding,
XML_Encoding *info) {
const intptr_t number = (intptr_t)userData;
assert_true(number == 0xC0FFEE);
return long_encoding_handler(userData, encoding, info);
}
/* External Entity Handlers */
int XMLCALL
+4 -1
@@ -10,7 +10,7 @@
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2005-2012 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2016-2024 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Joe Orton <jorton@redhat.com>
Copyright (c) 2017 José Gutiérrez de la Concha <jose@zeroc.com>
@@ -159,6 +159,9 @@ extern int XMLCALL long_encoding_handler(void *userData,
const XML_Char *encoding,
XML_Encoding *info);
extern int XMLCALL user_data_checking_unknown_encoding_handler(
void *userData, const XML_Char *encoding, XML_Encoding *info);
/* External Entity Handlers */
typedef struct ExtOption {
+33 -2
@@ -10,7 +10,7 @@
Copyright (c) 2003 Greg Stein <gstein@users.sourceforge.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2005-2012 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017-2022 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2017 Joe Orton <jorton@redhat.com>
Copyright (c) 2017 José Gutiérrez de la Concha <jose@zeroc.com>
@@ -19,6 +19,7 @@
Copyright (c) 2020 Tim Gates <tim.gates@iress.com>
Copyright (c) 2021 Donghee Na <donghee.na@python.org>
Copyright (c) 2023 Sony Corporation / Snild Dolkow <snild@sony.com>
Copyright (c) 2025 Berkay Eren Ürün <berkay.ueruen@siemens.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -211,7 +212,7 @@ START_TEST(test_misc_version) {
if (! versions_equal(&read_version, &parsed_version))
fail("Version mismatch");
if (xcstrcmp(version_text, XCS("expat_2.7.3"))
if (xcstrcmp(version_text, XCS("expat_2.7.5"))
!= 0) /* needs bump on releases */
fail("XML_*_VERSION in expat.h out of sync?\n");
}
@@ -771,6 +772,35 @@ START_TEST(test_misc_async_entity_rejected) {
}
END_TEST
START_TEST(test_misc_no_infinite_loop_issue_1161) {
XML_Parser parser = XML_ParserCreate(NULL);
const char *text = "<!DOCTYPE d SYSTEM 'secondary.txt'>";
struct ExtOption options[] = {
{XCS("secondary.txt"),
"<!ENTITY % p SYSTEM 'tertiary.txt'><!ENTITY g '%p;'>"},
{XCS("tertiary.txt"), "<?xml version='1.0'?><a"},
{NULL, NULL},
};
XML_SetUserData(parser, options);
XML_SetParamEntityParsing(parser, XML_PARAM_ENTITY_PARSING_ALWAYS);
XML_SetExternalEntityRefHandler(parser, external_entity_optioner);
assert_true(_XML_Parse_SINGLE_BYTES(parser, text, (int)strlen(text), XML_TRUE)
== XML_STATUS_ERROR);
#if defined(XML_DTD)
assert_true(XML_GetErrorCode(parser) == XML_ERROR_EXTERNAL_ENTITY_HANDLING);
#else
assert_true(XML_GetErrorCode(parser) == XML_ERROR_NO_ELEMENTS);
#endif
XML_ParserFree(parser);
}
END_TEST
void
make_miscellaneous_test_case(Suite *s) {
TCase *tc_misc = tcase_create("miscellaneous tests");
@@ -801,4 +831,5 @@ make_miscellaneous_test_case(Suite *s) {
tcase_add_test(tc_misc, test_misc_expected_event_ptr_issue_980);
tcase_add_test(tc_misc, test_misc_sync_entity_tolerated);
tcase_add_test(tc_misc, test_misc_async_entity_rejected);
tcase_add_test(tc_misc, test_misc_no_infinite_loop_issue_1161);
}
+27
@@ -1505,6 +1505,32 @@ START_TEST(test_nsalloc_prefixed_element) {
}
END_TEST
/* Verify that retry after OOM in setContext() does not crash.
*/
START_TEST(test_nsalloc_setContext_zombie) {
const char *text = "<doc>Hello</doc>";
unsigned int i;
const unsigned int max_alloc_count = 30;
for (i = 0; i < max_alloc_count; i++) {
g_allocation_count = (int)i;
if (XML_Parse(g_parser, text, (int)strlen(text), XML_TRUE)
!= XML_STATUS_ERROR)
break;
/* Retry on the same parser — must not crash */
g_allocation_count = ALLOC_ALWAYS_SUCCEED;
XML_Parse(g_parser, text, (int)strlen(text), XML_TRUE);
nsalloc_teardown();
nsalloc_setup();
}
if (i == 0)
fail("Parsing worked despite failing allocations");
else if (i == max_alloc_count)
fail("Parsing failed even at maximum allocation count");
}
END_TEST
void
make_nsalloc_test_case(Suite *s) {
TCase *tc_nsalloc = tcase_create("namespace allocation tests");
@@ -1539,4 +1565,5 @@ make_nsalloc_test_case(Suite *s) {
tcase_add_test__if_xml_ge(tc_nsalloc, test_nsalloc_long_default_in_ext);
tcase_add_test(tc_nsalloc, test_nsalloc_long_systemid_in_ext);
tcase_add_test(tc_nsalloc, test_nsalloc_prefixed_element);
tcase_add_test(tc_nsalloc, test_nsalloc_setContext_zombie);
}
+3
@@ -319,6 +319,9 @@ SO_MINOR = @SO_MINOR@
SO_PATCH = @SO_PATCH@
STRIP = @STRIP@
VERSION = @VERSION@
VSCRIPT_LDFLAGS = @VSCRIPT_LDFLAGS@
_EXPAT_COMMENT_ATTR_INFO = @_EXPAT_COMMENT_ATTR_INFO@
_EXPAT_COMMENT_DTD_OR_GE = @_EXPAT_COMMENT_DTD_OR_GE@
abs_builddir = @abs_builddir@
abs_srcdir = @abs_srcdir@
abs_top_builddir = @abs_top_builddir@
+2 -2
@@ -11,11 +11,12 @@
Copyright (c) 2002-2003 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2004-2006 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2019 David Loffredo <loffredo@steptools.com>
Copyright (c) 2021 Donghee Na <donghee.na@python.org>
Copyright (c) 2024 Hanno Böck <hanno@gentoo.org>
Copyright (c) 2025 Alfonso Gregory <gfunni234@gmail.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -225,7 +226,6 @@ processStream(const XML_Char *filename, XML_Parser parser) {
if (filename != NULL)
close(fd);
break;
;
}
}
return 1;
+7 -6
@@ -11,7 +11,7 @@
Copyright (c) 2001-2003 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
Copyright (c) 2004-2009 Karl Waclawek <karl@waclawek.net>
Copyright (c) 2005-2007 Steven Solie <steven@solie.ca>
Copyright (c) 2016-2025 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2016-2026 Sebastian Pipping <sebastian@pipping.org>
Copyright (c) 2017 Rhodri James <rhodri@wildebeest.org.uk>
Copyright (c) 2019 David Loffredo <loffredo@steptools.com>
Copyright (c) 2020 Joe Orton <jorton@redhat.com>
@@ -19,6 +19,7 @@
Copyright (c) 2021 Tim Bray <tbray@textuality.com>
Copyright (c) 2022 Martin Ettl <ettl.martin78@googlemail.com>
Copyright (c) 2022 Sean McBride <sean@rogue-research.com>
Copyright (c) 2025 Alfonso Gregory <gfunni234@gmail.com>
Licensed under the MIT license:
Permission is hereby granted, free of charge, to any person obtaining
@@ -390,16 +391,13 @@ endDoctypeDecl(void *userData) {
notationCount++;
if (notationCount == 0) {
/* Nothing to report */
free((void *)data->currentDoctypeName);
data->currentDoctypeName = NULL;
return;
goto cleanUp;
}
notations = malloc(notationCount * sizeof(NotationList *));
if (notations == NULL) {
fprintf(stderr, "Unable to sort notations");
freeNotations(data);
return;
goto cleanUp;
}
for (p = data->notationListHead, i = 0; i < notationCount; p = p->next, i++) {
@@ -439,6 +437,8 @@ endDoctypeDecl(void *userData) {
fputts(T("]>\n"), data->fp);
free(notations);
cleanUp:
freeNotations(data);
free((void *)data->currentDoctypeName);
data->currentDoctypeName = NULL;
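Note on the hunks above: the early `return` statements in `endDoctypeDecl` become `goto cleanUp`, so every exit path releases the same resources in one place. A minimal standalone sketch of that single-exit pattern follows. `DocState` and `reportItems` are hypothetical names for illustration, and the `released` counter exists only to make the behaviour observable.

```c
#include <stdlib.h>

/* Hypothetical state, loosely modelled on per-document data. */
typedef struct {
  char *name;   /* owned; released on every exit path */
  int released; /* counts how often cleanup ran, for demonstration */
} DocState;

/* Returns the number of items "reported". Every early exit jumps to
 * cleanUp, so the owned name is freed exactly once per call. */
static int
reportItems(DocState *state, size_t itemCount) {
  int reported = 0;
  int *scratch = NULL;

  if (itemCount == 0)
    goto cleanUp; /* nothing to report */

  scratch = malloc(itemCount * sizeof(int));
  if (scratch == NULL)
    goto cleanUp; /* allocation failed */

  for (size_t i = 0; i < itemCount; i++)
    scratch[i] = (int)i;
  reported = (int)itemCount;
  free(scratch);

cleanUp:
  free(state->name);
  state->name = NULL;
  state->released++;
  return reported;
}
```

Routing all exits through one label avoids the situation where one early return frees a resource and another forgets to.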
@@ -900,6 +900,7 @@ usage(const XML_Char *prog, int rc) {
T(" -n enable [n]amespace processing\n")
T(" -p enable processing of external DTDs and [p]arameter entities\n")
T(" -x enable processing of e[x]ternal entities\n")
T(" (CAREFUL! This makes xmlwf vulnerable to external entity attacks (XXE).)\n")
T(" -e ENCODING override any in-document [e]ncoding declaration\n")
T(" -w enable support for [W]indows code pages\n")
T(" -r disable memory-mapping and use [r]ead calls instead\n")
+130 -56
@@ -6,7 +6,7 @@
# \___/_/\_\ .__/ \__,_|\__|
# |_| XML parser
#
# Copyright (c) 2019-2025 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2019-2026 Sebastian Pipping <sebastian@pipping.org>
# Copyright (c) 2021 Tim Bray <tbray@textuality.com>
# Licensed under the MIT license:
#
@@ -30,28 +30,31 @@
# USE OR OTHER DEALINGS IN THE SOFTWARE.
import argparse
from textwrap import dedent
epilog = """
environment variables:
EXPAT_ACCOUNTING_DEBUG=(0|1|2|3)
Control verbosity of accounting debugging (default: 0)
EXPAT_ENTITY_DEBUG=(0|1)
Control verbosity of entity debugging (default: 0)
EXPAT_ENTROPY_DEBUG=(0|1)
Control verbosity of entropy debugging (default: 0)
EXPAT_MALLOC_DEBUG=(0|1|2)
Control verbosity of allocation tracker (default: 0)
epilog = dedent(
"""
environment variables:
EXPAT_ACCOUNTING_DEBUG=(0|1|2|3)
Control verbosity of accounting debugging (default: 0)
EXPAT_ENTITY_DEBUG=(0|1)
Control verbosity of entity debugging (default: 0)
EXPAT_ENTROPY_DEBUG=(0|1)
Control verbosity of entropy debugging (default: 0)
EXPAT_MALLOC_DEBUG=(0|1|2)
Control verbosity of allocation tracker (default: 0)
exit status:
0 the input files are well-formed and the output (if requested) was written successfully
1 could not allocate data structures, signals a serious problem with execution environment
2 one or more input files were not well-formed
3 could not create an output file
4 command-line argument error
exit status:
0 the input files are well-formed and the output (if requested) was written successfully
1 could not allocate data structures, signals a serious problem with execution environment
2 one or more input files were not well-formed
3 could not create an output file
4 command-line argument error
xmlwf of libexpat is software libre, licensed under the MIT license.
Please report bugs at https://github.com/libexpat/libexpat/issues -- thank you!
"""
xmlwf of libexpat is software libre, licensed under the MIT license.
Please report bugs at https://github.com/libexpat/libexpat/issues -- thank you!
"""
)
usage = """
%(prog)s [OPTIONS] [FILE ...]
@@ -59,50 +62,121 @@
%(prog)s -v|--version
"""
parser = argparse.ArgumentParser(prog='xmlwf', add_help=False,
usage=usage,
description='xmlwf - Determines if an XML document is well-formed',
formatter_class=argparse.RawTextHelpFormatter,
epilog=epilog)
parser = argparse.ArgumentParser(
prog="xmlwf",
add_help=False,
usage=usage,
description="xmlwf - Determines if an XML document is well-formed",
formatter_class=argparse.RawTextHelpFormatter,
epilog=epilog,
)
input_related = parser.add_argument_group('input control arguments')
input_related.add_argument('-s', action='store_true', help='print an error if the document is not [s]tandalone')
input_related.add_argument('-n', action='store_true', help='enable [n]amespace processing')
input_related.add_argument('-p', action='store_true', help='enable processing of external DTDs and [p]arameter entities')
input_related.add_argument('-x', action='store_true', help='enable processing of e[x]ternal entities')
input_related.add_argument('-e', action='store', metavar='ENCODING', help='override any in-document [e]ncoding declaration')
input_related.add_argument('-w', action='store_true', help='enable support for [W]indows code pages')
input_related.add_argument('-r', action='store_true', help='disable memory-mapping and use [r]ead calls instead')
input_related.add_argument('-g', metavar='BYTES', help='buffer size to request per call pair to XML_[G]etBuffer and read (default: 8 KiB)')
input_related.add_argument('-k', action='store_true', help='when processing multiple files, [k]eep processing after first file with error')
input_related = parser.add_argument_group("input control arguments")
input_related.add_argument(
"-s", action="store_true", help="print an error if the document is not [s]tandalone"
)
input_related.add_argument(
"-n", action="store_true", help="enable [n]amespace processing"
)
input_related.add_argument(
"-p",
action="store_true",
help="enable processing of external DTDs and [p]arameter entities",
)
input_related.add_argument(
"-x",
action="store_true",
help=(
"enable processing of e[x]ternal entities"
"\n"
"(CAREFUL! This makes xmlwf vulnerable to external entity attacks (XXE).)"
),
)
input_related.add_argument(
"-e",
action="store",
metavar="ENCODING",
help="override any in-document [e]ncoding declaration",
)
input_related.add_argument(
"-w", action="store_true", help="enable support for [W]indows code pages"
)
input_related.add_argument(
"-r",
action="store_true",
help="disable memory-mapping and use [r]ead calls instead",
)
input_related.add_argument(
"-g",
metavar="BYTES",
help="buffer size to request per call pair to XML_[G]etBuffer and read (default: 8 KiB)",
)
input_related.add_argument(
"-k",
action="store_true",
help="when processing multiple files, [k]eep processing after first file with error",
)
output_related = parser.add_argument_group('output control arguments')
output_related.add_argument('-d', action='store', metavar='DIRECTORY', help='output [d]estination directory')
output_related = parser.add_argument_group("output control arguments")
output_related.add_argument(
"-d", action="store", metavar="DIRECTORY", help="output [d]estination directory"
)
output_mode = output_related.add_mutually_exclusive_group()
output_mode.add_argument('-c', action='store_true', help='write a [c]opy of input XML, not canonical XML')
output_mode.add_argument('-m', action='store_true', help='write [m]eta XML, not canonical XML')
output_mode.add_argument('-t', action='store_true', help='write no XML output for [t]iming of plain parsing')
output_related.add_argument('-N', action='store_true', help='enable adding doctype and [n]otation declarations')
output_mode.add_argument(
"-c", action="store_true", help="write a [c]opy of input XML, not canonical XML"
)
output_mode.add_argument(
"-m", action="store_true", help="write [m]eta XML, not canonical XML"
)
output_mode.add_argument(
"-t", action="store_true", help="write no XML output for [t]iming of plain parsing"
)
output_related.add_argument(
"-N", action="store_true", help="enable adding doctype and [n]otation declarations"
)
billion_laughs = parser.add_argument_group('amplification attack protection (e.g. billion laughs)',
description='NOTE: '
'If you ever need to increase these values '
'for non-attack payload, please file a bug report.')
billion_laughs.add_argument('-a', metavar='FACTOR',
help='set maximum tolerated [a]mplification factor (default: 100.0)')
billion_laughs.add_argument('-b', metavar='BYTES', help='set number of output [b]ytes needed to activate (default: 8 MiB/64 MiB)')
billion_laughs = parser.add_argument_group(
"amplification attack protection (e.g. billion laughs)",
description=(
"NOTE: "
"If you ever need to increase these values "
"for non-attack payload, please file a bug report."
),
)
billion_laughs.add_argument(
"-a",
metavar="FACTOR",
help="set maximum tolerated [a]mplification factor (default: 100.0)",
)
billion_laughs.add_argument(
"-b",
metavar="BYTES",
help="set number of output [b]ytes needed to activate (default: 8 MiB/64 MiB)",
)
reparse_deferral = parser.add_argument_group('reparse deferral')
reparse_deferral.add_argument('-q', action='store_true',
help='disable reparse deferral, and allow [q]uadratic parse runtime with large tokens')
reparse_deferral = parser.add_argument_group("reparse deferral")
reparse_deferral.add_argument(
"-q",
action="store_true",
help="disable reparse deferral, and allow [q]uadratic parse runtime with large tokens",
)
parser.add_argument('files', metavar='FILE', nargs='*', help='file to process (default: STDIN)')
parser.add_argument(
"files", metavar="FILE", nargs="*", help="file to process (default: STDIN)"
)
info = parser.add_argument_group('info arguments')
info = parser.add_argument_group("info arguments")
info = info.add_mutually_exclusive_group()
info.add_argument('-h', '--help', action='store_true', help='show this [h]elp message and exit')
info.add_argument('-v', '--version', action='store_true', help='show program\'s [v]ersion number and exit')
info.add_argument(
"-h", "--help", action="store_true", help="show this [h]elp message and exit"
)
info.add_argument(
"-v",
"--version",
action="store_true",
help="show program's [v]ersion number and exit",
)
if __name__ == '__main__':
if __name__ == "__main__":
parser.print_help()