ALib C++ Framework
Library Version: 2511 R0
formatterpythonstyle.inl
//==================================================================================================
/// \file
/// This header-file is part of module \alib_format of the \aliblong.
///
/// \emoji :copyright: 2013-2025 A-Worx GmbH, Germany.
/// Published under #"mainpage_license".
//==================================================================================================

//==================================================================================================
/// Implements a #"format::Formatter" according to the
/// \https{formatting standards of the Python language,docs.python.org/3.5/library/string.html#format-string-syntax}.
///
/// \note
///   Inherited, public fields of parent class \b FormatterStdImpl provide important possibilities
///   for changing the formatting behavior of instances of this class. Therefore, do not forget
///   to consult the #"alib::format::FormatterStdImpl;parent class's documentation".
///
/// In general, the original \b Python specification is covered quite well. However, there are
/// some differences: some things are not possible (considering that Python is a scripting
/// language), while this class also adds some very helpful extensions to that standard. Instead
/// of repeating a complete documentation, please refer to the
/// \https{Python Documentation,docs.python.org/3.5/library/string.html#format-string-syntax}
/// as the foundation and then take note of the following list of differences, extensions and
/// general hints:
///
/// - <b>General Notes:</b>
///   \b Python defines a placeholder field as follows
///
///         "{" [field_name] ["!" conversion] [":" format_spec] "}"
///
///
///   - This formatter is <b>less strict</b> with respect to the order of the format symbols. E.g.,
///     it allows <c>{:11.5,}</c> where Python allows only <c>{:11,.5}</c>.
///
///   - With this class being derived from
///     #"alib::format::FormatterStdImpl;FormatterStdImpl", features of the parent are
///     available to this formatter as well. This is especially true and sometimes useful with
///     respect to setting default values for number formatting. For example, this allows modifying
///     all number output without explicitly repeating the settings in each placeholder of format
///     strings. Other options, for example, the grouping characters used with hexadecimal numbers,
///     cannot even be changed with the <b>Python Style</b> formatting options. The only way of
///     doing so is modifying the properties of the formatter object before the format operation.
///
///   - Nested replacements in format specification fields are (by nature of this implementation
///     language) \b not supported.
///
/// <p>
/// - <b>Positional arguments and field name:</b>
///   - By the nature of the implementation language (<em>C++, no introspection</em>) of this class,
///     \b field_name can \b not be the name of an identifier, an attribute name or an array element
///     index. It can only be a positional argument index, hence a number that chooses a different
///     index in the provided argument list.<br>
///     However, the use of field names is often a requirement in use cases that offer configurable
///     format string setup to the "end user". Therefore, there are two alternatives to cope
///     with the limitation:
///     - In simple cases, it is possible to just add all optionally needed data to the argument
///       list, document their index positions and let the user use positional argument notation
///       to choose the right value from the list.
///     - More elegant, however, is the use of class
///       #"alib::format::PropertyFormatter;PropertyFormatter"
///       which extends the format specification by custom identifiers which control the placement
///       of corresponding data in the format argument list. This class uses a translator table from
///       identifier strings to custom callback functions. This way, much more than just simple
///       field names are allowed.
///
///   - When using positional arguments in format string placeholders, the Python formatter
///     implementation does not allow switching from <b>automatic field indexing</b> to explicit
///     indexing. This \b %ALib implementation does allow it. The automatic index (used when no
///     positional argument is given for a placeholder) always starts with index \c 0 and is
///     incremented each time automatic indexing is used. Occurrences of explicit indexing have
///     no influence on the automatic indexing.
///
///
/// <p>
/// - <b>Binary, Hexadecimal and Octal Numbers:</b>
///   - Binary, hexadecimal and octal output is <b>cut in size</b> (!) when a field width is given
///     that is smaller than the resulting number of digits of the number arguments provided.
///     \note This implies that a value written might not be equal to the value given.
///           This is not a bug but a design decision. The rationale behind this is that with this
///           behavior, there is no need to mask lower digits when passing the arguments to the
///           format invocation. In other words, the formatter "assumes" that the given field width
///           indicates that only a corresponding number of lower digits are of interest.
///
///   - If no width is given and the argument contains a boxed pointer, then the platform-dependent
///     full output width of pointer types is used.
///
///   - The number <b>grouping option</b> (<c>','</c>) can also be used with binary, hexadecimal
///     and octal output.
///     The types support different grouping separators for nibbles, bytes, 16-bit and 32-bit words.
///     Changing the separator symbols is not possible with the format fields of the format strings
///     (if it were, this would become very incompatible with Python standards). Changes have to be
///     made before the format operation by modifying the field
///     #"^FormatterPythonStyle::AlternativeNumberFormat" which is provided through parent class
///     \b %Formatter.
///
///   - Alternative form (\c '#') adds prefixes as specified in members
///     - #"TNumberFormat::BinLiteralPrefix",
///     - #"TNumberFormat::HexLiteralPrefix", and
///     - #"TNumberFormat::OctLiteralPrefix".
///
///     For upper case formats, those are taken from the inherited field
///     #"^FormatterPythonStyle::DefaultNumberFormat", for lower case formats from
///     #"^FormatterPythonStyle::AlternativeNumberFormat".
///     However, in alignment with the \b Python specification, \b both default to lower case
///     literals \c "0b", \c "0o" and \c "0x". The user may change all defaults.
///
///
/// <p>
/// - <b>Floating point values:</b>
///   - If floating point values are provided without a type specification in the format string,
///     then all values of the inherited field #"^FormatterPythonStyle::DefaultNumberFormat" are
///     used to format the number.
///   - For lower case floating point format types (\c 'f' and \c 'e'), the values specified in
///     attributes \b %ExponentSeparator, \b %NANLiteral and \b %INFLiteral of the inherited field
///     #"^FormatterPythonStyle::AlternativeNumberFormat" are used.
///     For upper case types (\c 'F' and \c 'E') the corresponding attributes in the
///     field #"^FormatterPythonStyle::DefaultNumberFormat" apply.
///   - Fixed point formats (\c 'f' and \c 'F' types) do not support an arbitrary length.
///     See class #"TNumberFormat;NumberFormat" for the limits.
///     Also, very high values and values close to zero may be converted to scientific format.
///     Finally, if flag #"NumberFormatFlags::ForceScientific" is set in field
///     #"TNumberFormat::Flags" of member #"DefaultNumberFormat", types
///     \c 'f' and \c 'F' behave like types \c 'e' and \c 'E'.
///   - When both a \p{width} and a \p{precision} are given, then the \p{precision} determines the
///     fractional part, even if the type is \b 'g' or \b 'G'. This differs from the
///     Python formatter, which interprets \p{precision} as the total number of significant digits
///     in case of types \b 'g' or \b 'G'.
///   - The 'general format' type for floats, specified with \c 'g' or \c 'G', limits the
///     precision of the fractional part in the Python implementation, even if \p{precision} is
///     not further specified. This implementation limits the precision only if the type is
///     \c 'f' or \c 'F'.
///
/// <p>
/// - <b>%String Conversion:</b><br>
///   If \e type \c 's' (or no \e type) is given in the \b format_spec of the replacement field,
///   a string representation of the given argument is used.
///   In \b Java and \b C# such representation is received by invoking <c>Object.[t|T]oString()</c>.
///   Consequently, to support string representations of custom types, in these languages
///   the corresponding <b>[t|T]oString()</b> methods of the type have to be implemented.
///
///   In C++ the arguments are "boxed" into objects of type
///   #"alib::boxing::Box;Box". For the string representation, the formatter invokes
///   box-function #"FAppend". A default implementation exists which
///   for custom types appends the type name and the memory address of the object in hexadecimal
///   format. To support custom string representations (for custom types), this box-function
///   needs to be implemented for the type in question. Information and sample code on how to do
///   this is found in the documentation of \alib_boxing , chapter
///   #"alib_boxing_strings_fappend".
///
/// - <b>Hash-Value Output:</b><br>
///   In extension (and deviation) of the Python specification, format specification type \c 'h'
///   and its upper case version \c 'H' are implemented. The hash value of the argument object is
///   written in hexadecimal format. Options of the type are identical to those of \c 'x',
///   respectively \c 'X'.
///
///   In the C++ language implementation of \alib, instead of hash values of objects, the pointer
///   found in method #"Box::Data" is printed. If class types are boxed and the default
///   boxing mechanics are used with such types, this will show the memory address of
///   the given instance.
///
/// - <b>Boolean output:</b><br>
///   In extension (and deviation) of the Python specification, format specification type \c 'B'
///   is implemented. The word \b "true" is written if the given value represents a boolean \c true
///   value, \b "false" otherwise.
///
///   In the C++ language implementation of \alib, the argument is evaluated to boolean by invoking
///   box-function #"FIsTrue".
///
/// <p>
/// - <b>%Custom %Format Specifications:</b><br>
///   With \c Python formatting syntax, placeholders have the following syntax:
///
///         "{" [field_name] ["!" conversion] [":" format_spec] "}"
///
///   The part that follows the colon is called \b format_spec. \b Python passes this portion of the
///   placeholder to a built-in function \c format(). Now, each type may interpret this string in a
///   type-specific way. But most built-in \b Python types do it along what they call the
///   \https{"Format Specification Mini Language",docs.python.org/3.5/library/string.html#format-specification-mini-language}.
///
///   With this implementation, the approach is very similar. The only difference is that the
///   "Format Specification Mini Language" is implemented for standard types right within this class.
///   But before processing \b format_spec, this class will check if the argument type assigned to
///   the placeholder provides a custom implementation of box-function #"FFormat".
///   If so, this function is invoked and string \b format_spec is passed for custom processing.
///
///   Information and sample code on how to adapt custom types to support this interface is
///   found in the Programmer's Manual of this module, with chapter
///   #"alib_format_custom_types_fformat".
///
///   For example, \alib class #"time::DateTime" supports custom formatting with box-function
///   #"FFormat_DateTime" which uses helper-class
///   #"util::CalendarDateTime" that provides a very common specific mini language
///   for #"CalendarDateTime::Format;formatting date and time values".
///
/// <p>
/// - <b>Conversions:</b><br>
///   In the \b Python placeholder syntax specification:
///
///         "{" [field_name] ["!" conversion] [":" format_spec] "}"
///
///   symbol \c '!', if used before the colon <c>':'</c>, defines
///   what is called the <b>conversion</b>. With \b Python, three options are given:
///   \c '!s' which calls \c str() on the value, \c '!r' which calls \c repr() and \c '!a' which
///   calls \c ascii(). This is of course not applicable to this formatter. As a replacement,
///   this class extends the original specification of that conversion using \c '!'.
///   The following provides a list of conversions supported. The names given can be abbreviated
///   at any point and ignore letter case, e.g., \c !Upper can be \c !UP or just \c !u.
///   In addition, multiple conversions can be given by concatenating them, each repeating
///   character \c '!'.<br>
///   The conversions supported are:
///
///   - <b>!Upper</b><br>
///     Converts the contents of the field to upper case.
///
///   - <b>!Lower</b><br>
///     Converts the contents of the field to lower case.
///
///   - <b>!Quote[O[C]]</b><br>
///     Puts quote characters around the field.
///     Note that these characters do not count towards an optionally given field width but
///     are added to it.
///     An alias name for \b !Quote is given with \b !Str. As the alias can be abbreviated to \b !s,
///     this provides compatibility with the \b Python specification.
///
///     In extension to the Python syntax specification, one or two optional characters might be
///     given after the (optionally abbreviated) terms "Quote" respectively "Str".
///     If one character is given, this is used as the open and closing character. If two are given,
///     the first is used as the open character, the second as the closing one.
///     For example, <b>{!Q'}</b> uses single quotes, or <b>{!Q[]}</b> uses square brackets.
///     Bracket types <b>'{'</b> and <b>'}'</b> cannot be used with this conversion.
///     To surround a placeholder's contents in this bracket type, add <b>{{</b> and <b>}}</b>
///     around the placeholder, resulting in <b>{{{}}}</b>.
///
///   - <b>!ESC[<|>]</b><br>
///     In its default behavior or if \c '<' is specified, certain characters are converted to escape
///     sequences.
///     If \c '>' is given, escape sequences are converted to their (ascii) value.
///     See #"TEscape;Escape" for details about the conversion
///     that is performed.<br>
///     An alias name for \b !ESC< is given with \b !a which provides compatibility
///     with the \b Python specification.
///     \note If \b !ESC< is used in combination with \b !Quote, then \b !ESC< should be the first
///           conversion specifier. Otherwise, the quotes inserted might be escaped as well.
///
///   - <b>!Fill[Cc]</b><br>
///     Inserts as many characters as denoted by the integer type argument.
///     By default, the fill character is space <c>' '</c>. It can be changed with optional character
///     'C' plus the character wanted.
///
///   - <b>!Tab[Cc][NNN]</b><br>
///     Inserts fill characters to extend the length of the string to be a multiple of a tab width.
///     By default, the fill character is space <c>' '</c>. It can be changed with optional character
///     'C' plus the character wanted. The tab width defaults to \c 8. It can be changed by adding
///     an unsigned decimal number.
///
///   - <b>!ATab[[Cc][NNN]|Reset]</b><br>
///     Inserts an "automatic tabulator stop". These are tabulator positions that are stored
///     internally and are automatically extended at the moment the actual contents exceeds the
///     currently stored tab-position. An arbitrary number of auto tab stop and field width
///     (see <b>!AWidth</b> below) values is maintained by the formatter.
///
///     With each new invocation of #"format::Formatter",
///     the first auto value is chosen and with each use of \c !ATab or \c !AWidth, the next value is
///     used.<br>
///     However, the stored values are cleared whenever \b %Format is invoked on a non-acquired
///     formatter! This means, to preserve the auto-positions across multiple format invocations,
///     a formatter has to be acquired explicitly before the format operations and released
///     afterwards.
///
///     Alternatively, the positions currently stored with the formatter can be reset by
///     providing argument \c Reset in the format string.
///
///     By default, the fill character is space <c>' '</c>. It can be changed with optional character
///     'C' plus the character wanted. The optional number provided gives the growth value by which
///     the tab will grow if its position is exceeded. This value defaults to \c 3.
///
///     Both auto tab and auto width conversions may be used to increase readability of multiple
///     output lines. Of course, the output is completely tabular only if those values that result
///     in the biggest sizes are formatted first. If a perfectly tabular output is desired, the data
///     to be formatted may be processed twice: once to a temporary buffer which is discarded and
///     then a second time to the desired output \b %AString.
///
///   - <b>!AWidth[NNN|Reset]</b><br>
///     Increases field width with repetitive invocations of format whenever a field value did not
///     fit to the actually stored width. Optional decimal number \b NNN is added as a padding value.
///     For more information, see <b>!ATab</b> above.
///
///   - <b>!Xtinguish</b><br>
///     Does not print anything. This is useful if format strings are externalized, e.g., defined
///     in #"GetResourcePool;library resources". Modifications of such resources
///     might use this conversion to suppress the display of arguments (which usually are
///     hard-coded).
///
///   - <b>!Replace<search><replace></b><br>
///     Searches string \p{search} and replaces with \p{replace}. Both values have to be given
///     enclosed by characters \c '<' and \c '>'. In the special case that \p{search} is empty
///     (<c><></c>), string \p{replace} will be inserted if the field argument is an empty
///     string.
///
///\I{##########################################################################################}
/// # Reference Documentation #
/// @throws <b>alib::format::FMTExceptions</b>
///   - #"FMTExceptions::ArgumentIndexOutOfBounds"
///   - #"FMTExceptions::IncompatibleTypeCode"
///   - #"FMTExceptions::MissingClosingBracket"
///   - #"FMTExceptions::MissingPrecisionValuePS"
///   - #"FMTExceptions::DuplicateTypeCode"
///   - #"FMTExceptions::UnknownTypeCode"
///   - #"FMTExceptions::ExclamationMarkExpected"
///   - #"FMTExceptions::UnknownConversionPS"
///   - #"FMTExceptions::PrecisionSpecificationWithInteger"
//==================================================================================================
{
  //################################################################################################
  // Protected fields
  //################################################################################################
  protected:
    /// Set of extended placeholder attributes, needed for this type of formatter in
    /// addition to parent's #"FormatterStdImpl::PlaceholderAttributes".
    {
        /// The portion of the replacement field that represents the conversion specification.
        /// This specification is given at the beginning of the replacement field, starting with
        /// \c '!'.

        /// The position where the conversion was read. This is set to \c -1 in #"resetPlaceholder".
        int ConversionPos;


        /// The value read from the precision field. This is set to \c -1 in #"resetPlaceholder".
        int Precision;

        /// The position where the precision was read. This is set to \c -1 in #"resetPlaceholder".
        int PrecisionPos;

        /// The default precision if not given.
        /// This is set to \c 6 in #"resetPlaceholder", but is changed when specific.
    };

    /// The extended placeholder attributes.
    PlaceholderAttributesPS placeholderPS;

  //################################################################################################
  // Public fields
  //################################################################################################
  public:
    /// Storage of sizes for auto-tabulator feature <b>{!ATab}</b> and auto field width feature
    /// <b>{!AWidth}</b>.

    /// The default instance of field #"Sizes". This might be replaced with an external object.
    AutoSizes SizesDefaultInstance;

  //################################################################################################
  // Constructor/Destructor
  //################################################################################################
  public:
    /// Constructs this formatter.
    /// Inherited field #"DefaultNumberFormat" is initialized to meet the formatting defaults of
    /// Python.

    /// Clones and returns a copy of this formatter.
    ///
    /// If the formatter attached to field
    /// #"Formatter::Next;*" is of type \b %FormatterStdImpl, then that
    /// formatter is copied as well.
    ///
    /// @returns An object of type \b %FormatterPythonStyle with the same custom settings
    ///          as this one.
    ALIB_DLL virtual
    SPFormatter Clone() override;

    /// Resets #"AutoSizes".
    /// @return An internally allocated container of boxes that may be used to collect
    ///         formatter arguments.
    virtual BoxesMA& Reset() override { Sizes->Reset(); return Formatter::Reset(); }


  //################################################################################################
  // Implementation of FormatterStdImpl interface
  //################################################################################################
  protected:
    /// Sets the actual auto tab stop index to \c 0.
    virtual void initializeFormat() override { Sizes->Restart(); }



    /// Invokes parent implementation and then applies some changes to reflect what is defined as
    /// default in the Python string format specification.
    virtual void resetPlaceholder() override;

    /// Searches for \c '{' which is not '{{'.
    ///
    /// @return The index found, -1 if not found.
    virtual integer findPlaceholder() override;

    /// Parses placeholder field in Python notation. The portion \p{format_spec} is not
    /// parsed but stored in member
    /// #"PlaceholderAttributes;FormatSpec".
    ///
    /// @return \c true on success, \c false on errors.
    virtual bool parsePlaceholder() override;

    /// Parses the format specification for standard types as specified in
    /// \https{"Format Specification Mini Language",docs.python.org/3.5/library/string.html#format-specification-mini-language}.
    ///
    /// @return \c true on success, \c false on errors.
    virtual bool parseStdFormatSpec() override;

    /// Implementation of abstract method
    /// #"FormatterStdImpl::writeStringPortion;*".<br>
    /// While writing, replaces \c "{{" with \c "{" and \c "}}" with \c "}" as well as
    /// standard codes like \c "\\n", \c "\\r" or \c "\\t" with corresponding ascii codes.
    ///
    /// @param length The number of characters to write.
    virtual void writeStringPortion( integer length ) override;

    /// Processes "conversions" which are specified with \c '!'.
    ///
    /// @param startIdx The index of the start of the field written in #targetString.
    ///                 \c -1 indicates pre-phase.
    /// @param target   The target string, only if different from field #targetString, which
    ///                 indicates intermediate phase.
    /// @return \c false, if the placeholder should be skipped (nothing is written for it).
    ///         \c true otherwise.
    virtual bool preAndPostProcess( integer  startIdx,
                                    AString* target    ) override;


    /// Makes some attribute adjustments and invokes the standard implementation.
    /// @return \c true if OK, \c false if replacement should be aborted.
    virtual bool checkStdFieldAgainstArgument() override;
};
} // namespace [alib::format]

ALIB_EXPORT namespace alib {
/// Type alias in namespace \b alib.
using FormatterPythonStyle = format::FormatterPythonStyle;
}
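
Two of the deviations from Python described in the class documentation above can be illustrated without ALib itself: the mixing of automatic and explicit argument indexing, and the cutting of hexadecimal output to the given field width. The following is a minimal, self-contained sketch, not ALib code. The helper names `resolveArgumentIndices` and `hexLowerDigits` are hypothetical and only model the documented rules (the automatic index starts at 0 and advances only when automatic indexing is used; a field width smaller than the digit count keeps the lowest digits). Padding, grouping, and parsing of the rest of the placeholder are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of ALib): for each placeholder in `fmt`,
// returns the argument index it selects. Only a leading decimal field index
// is inspected; the remainder of the placeholder is skipped unparsed.
std::vector<int> resolveArgumentIndices(const std::string& fmt) {
    std::vector<int> result;
    int autoIndex = 0;  // automatic indexing always starts with index 0
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] != '{') continue;
        if (i + 1 < fmt.size() && fmt[i + 1] == '{') { ++i; continue; }  // escaped "{{"

        std::size_t p = i + 1;
        int  explicitIndex = 0;
        bool hasExplicit   = false;
        while (p < fmt.size() && fmt[p] >= '0' && fmt[p] <= '9') {
            explicitIndex = explicitIndex * 10 + (fmt[p] - '0');
            hasExplicit   = true;
            ++p;
        }
        // Explicit indices have no influence on the automatic index.
        result.push_back(hasExplicit ? explicitIndex : autoIndex++);

        while (p < fmt.size() && fmt[p] != '}') ++p;  // skip to the closing brace
        i = p;
    }
    return result;
}

// Hypothetical helper (not part of ALib): lower-case hexadecimal digits of
// `value`. If `width` is positive and smaller than the number of digits, only
// the lowest `width` digits are kept. No padding is applied otherwise.
std::string hexLowerDigits(std::uint64_t value, int width) {
    assert(width >= 0);
    static const char digits[] = "0123456789abcdef";
    std::string s;
    do {
        s.insert(s.begin(), digits[value & 0xF]);
        value >>= 4;
    } while (value != 0);
    if (width > 0 && s.size() > static_cast<std::size_t>(width))
        s.erase(0, s.size() - static_cast<std::size_t>(width));  // drop the higher digits
    return s;
}
```

Under these assumptions, `resolveArgumentIndices("{} {1} {} {0} {}")` yields the indices 0, 1, 1, 0, 2, and `hexLowerDigits(0x12345, 2)` yields `"45"`, which mirrors the documented note that a value written might not equal the value given.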