pod/perlguts pod/perlhacktips - various updates and new content

Perl · Jan 1, 2025 · f7bacd1 · f7bacd1
1 parent 8f5aa22
commit f7bacd1

Showing 2 changed files with 181 additions and 23 deletions.
diff --git a/pod/perlguts.pod b/pod/perlguts.pod
@@ -60,6 +60,8 @@ may not be usable in all circumstances.
 A numeric constant can be specified with L<perlapi/C<INT16_C>>,
 L<perlapi/C<UINTMAX_C>>, and similar.
 
+See also L<perlhacktips/"Portability problems">.
+
 =for apidoc_section $integer
 =for apidoc  Ayh ||IV
 =for apidoc_item ||I8
@@ -2943,8 +2945,32 @@ The context-free version of Perl_warner is called
 Perl_warner_nocontext, and does not take the extra argument.  Instead
 it does C<dTHX;> to get the context from thread-local storage.  We
 C<#define warner Perl_warner_nocontext> so that extensions get source
-compatibility at the expense of performance.  (Passing an arg is
-cheaper than grabbing it from thread-local storage.)
+compatibility at the expense of performance.  Passing an arg is
+much cheaper and faster than fetching it from the OS's thread-local
+storage API with function calls.
+
+But consider this: if there is a choice between C<Perl_croak> and
+C<Perl_croak_nocontext>, which one do you pick?  Which one is
+more efficient?  Is it even possible to make the C<if(assert_failed)> test true
+and enter the conditional branch that calls C<Perl_croak>?
+
+Maybe only from a test file.  Maybe not.  Your C<Perl_croak> branch is probably
+unreachable until you add a new bug.  So the performance of
+C<Perl_croak_nocontext> compared to C<Perl_croak> doesn't matter.  The C<dTHX;>
+call inside the slower C<Perl_croak_nocontext> will never execute in anyone's
+normal control flow.  If the error branch never executes, optimize what does
+execute.  By removing the C<aTHX> arg, you save 4-12 bytes of space and 1-3 CPU
+instructions on a cold branch, because the call expression invoking
+C<Perl_croak_nocontext> pushes one less variable onto the C stack than a call
+to C<Perl_croak>.  The CPU has less to jump over now.
+
+The rationale for preferring the C<_nocontext> variant applies only to
+C<Perl_croak>, and nowhere else except for the deprecated
+C<Perl_die_nocontext>/C<Perl_die> pair and, as a third case, C<Perl_warn>.
+C<Perl_warn> is debatable.
+
+It doesn't apply to C<Perl_form>, C<Perl_mess>, or keyword
+C<Perl_op_die(OP * op)>, which could be normal control flow.
 
 You can ignore [pad]THXx when browsing the Perl headers/sources.
 Those are strictly for use within the core.  Extensions and embedders
@@ -2971,11 +2997,12 @@ argument somehow.  The kicker is that you will need to write it in
 such a way that the extension still compiles when Perl hasn't been
 built with MULTIPLICITY enabled.
 
-There are three ways to do this.  First, the easy but inefficient way,
-which is also the default, in order to maintain source compatibility
-with extensions: whenever F<XSUB.h> is #included, it redefines the aTHX
-and aTHX_ macros to call a function that will return the context.
-Thus, something like:
+There are three ways to do this.  The first and easiest way is to use Perl's
+legacy code compatibility layer, which is also the default.  Production-grade
+code and code intended for CPAN should never use this mode.  To maintain
+source compatibility with very old extensions, whenever F<XSUB.h> is #included,
+it redefines the aTHX and aTHX_ macros to call a function that will return the
+context.  Thus, something like:
 
         sv_setiv(sv, num);
 
@@ -2990,7 +3017,9 @@ or to this otherwise:
 
 You don't have to do anything new in your extension to get this; since
 the Perl library provides Perl_get_context(), it will all just
-work.
+work, but each XSUB will be much slower.  Benchmarks have shown that using the
+compatibility layer and Perl_get_context() takes 3x more wall time in the best
+case, and 8.5x in the worst case.
 
 The second, more efficient way is to use the following template for
 your Foo.xs:
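
The trade-off discussed in the perlguts hunks above (passing the context as an explicit argument versus looking it up in thread-local storage inside the callee) can be modeled outside of Perl. The following is a minimal sketch in plain C11 under stated assumptions: the type struct interp and the functions set_context, get_context, warn_ctx, and warn_nocontext are hypothetical names invented for illustration and are not Perl's API, and _Thread_local stands in for whatever TLS mechanism the platform offers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an interpreter structure (not Perl's). */
struct interp {
    long warn_count;    /* how many warnings were issued */
    long tls_lookups;   /* how many times the slow lookup ran */
};

/* One slot per thread; models the OS thread-local storage API. */
static _Thread_local struct interp *current_interp = NULL;

static void set_context(struct interp *ip)
{
    current_interp = ip;
}

/* Slow path: every call pays for a thread-local lookup. */
static struct interp *get_context(void)
{
    struct interp *ip = current_interp;
    ip->tls_lookups++;
    return ip;
}

/* Fast variant: the caller already holds the context and passes it in. */
static void warn_ctx(struct interp *ip, const char *msg)
{
    (void)msg;          /* a real implementation would format and print */
    ip->warn_count++;
}

/* Convenience variant: one argument fewer at every call site,
 * paid for by a lookup inside the callee. */
static void warn_nocontext(const char *msg)
{
    warn_ctx(get_context(), msg);
}
```

In this model, a call to warn_ctx leaves tls_lookups untouched, while each call to warn_nocontext adds one lookup. That lookup is the cost that matters on hot paths and is irrelevant on a branch that never executes.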

diff --git a/pod/perlhacktips.pod b/pod/perlhacktips.pod
@@ -53,25 +53,101 @@ supported"> for further discussion about context.
 
 Not compiling with -DDEBUGGING
 
-The DEBUGGING define exposes more code to the compiler, therefore more
-ways for things to go wrong.  You should try it.
+The DEBUGGING define exposes more code to the compiler and turns on Perl's
+asserts, so there are more ways for things to go wrong.  A Perl built with
+the C<DEBUGGING> define will be visibly slower in the shell and every other
+subsystem.  C<DEBUGGING> is only for development of XS modules or core code,
+never for production use, but its maximum error checking is crucial for
+good new code.  You should try it.
 
 =item *
 
-Introducing (non-read-only) globals
-
-Do not introduce any modifiable globals, truly global or file static.
-They are bad form and complicate multithreading and other forms of
-concurrency.  The right way is to introduce them as new interpreter
-variables, see F<intrpvar.h> (at the very end for binary
-compatibility).
+Introducing (non-read-only) globals and statics
+
+Do not introduce any modifiable C globals, whether truly global variables
+declared with C<extern> visibility or per-file globals declared with C<static>
+visibility.  They are bad form, are not memory safe, and complicate
+multithreading and other forms of concurrency.  XS modules have a dedicated,
+simple API to create their own thread-safe global variables; see
+L<perlxs/Safely Storing Static Data in XS>.  But the interpreter core can't use
+that API.
+
+The interpreter currently does not use any atomic intrinsic functions offered
+by a C compiler.  Instead, Perl's thread-safe serialization is done with an
+internal API with names like C<MUTEX_INIT()> and C<MUTEX_LOCK()>.
+
+Historically, atomic operations didn't exist on most CPU architectures that Perl
+runs on.  Where they existed, atomic APIs were always OS- and vendor-specific,
+and never portable.
+
+As of 5.35.5, perl dropped support for a strict C89 compiler and moved to
+a minimum requirement of C89 plus some C99.  See L</C99>.  C11 standardized some
+atomics for the first time in the optionally implemented C<stdatomic.h>.
+Patches are welcome to add a portable atomic API, with fallbacks to
+C<MUTEX_LOCK()>.
+
+The right way to introduce a new C global variable will usually be to add
+it as a new interpreter variable.  See F<intrpvar.h>.  Since 5.10.0, adding,
+removing, or changing the size of any interpreter variable is not supported
+and is undefined behavior.  Recompiling XS modules is required.
+
+There are some loopholes to this policy if you are writing unstable
+experiments.  These loopholes can never be used in stable code, whether for
+the interpreter or for XS modules.  The loopholes may temporarily work, just
+long enough to finish the experiment.  Remember, the absence of a C<SEGV> or
+of a fatal C<panic:> error doesn't mean you didn't introduce a bug
+or corrupt a random malloc() block.
+
+From 5.10.0 up to 5.21.5, there was a provision that adding one new
+variable at the end of F<intrpvar.h>, as the very last member, was always binary
+compatible with older XS modules.  This was intended only for stable
+maintenance releases, for example a new maintenance release 5.18.1 loading an
+XS module compiled against header files from 5.18.0.  Remember that a newer
+5.18.1 core loading an XS binary compiled against 5.17.10 or 5.16.0 isn't allowed.
+
+So if cutting off current struct members in F<intrpvar.h> didn't introduce a
+crash, you saved some time in your experiment, but only through good luck.
+
+Starting with 5.21.6, stricter checking was added to match the definition of
+F<intrpvar.h> as understood by each build of the perl interpreter binary or the
+C<libperl> binary against the definition of F<intrpvar.h> as understood
+when the XS module's shared library file was compiled.
+
+The exact sanity check requires the length of the struct behind
+C<PerlInterpreter *> (aka C<my_perl>), that is C<sizeof(PerlInterpreter)> or
+C<sizeof(*my_perl)>, to be identical between core and an XS module, regardless
+of whether it's a non-threaded or threaded build of perl.  If the compile-time
+byte lengths don't match at runtime, the L<perldiag/"%s: loadable library and perl binaries are mismatched (got %s handshake key %p, needed %p)">
+error happens.
+
+For 5.21.6 and up, the way to add a new interpreter global variable while
+hacking on the interpreter, without recompiling XS, is to rename, repurpose, or
+turn into a union an existing variable from F<intrpvar.h> without changing its
+size, alignment, or offset.
+
+An easier option, if speed doesn't matter, is to put your new experimental
+pointer or integer into the former backend of C<MY_CXT_INIT>.  It is an C<HV*>
+named L<perlapi/"PL_modglobal">.
+
+If speed is important, add a new pointer member to F<intrpvar.h> just once in your
+branch, recompile all your XS modules once, and always keep the private patch
+in your repo.  Shrinking or growing the block that a pointer from C<Newx()>
+points to doesn't trip the interpreter struct size check of 5.21.6 and up.
+
+Take a look at the backend of the C<MY_CXT_INIT> API.  The backend is
+two variables, C<PL_my_cxt_list> and C<PL_my_cxt_size>.  Nothing prevents
+the C<perl_construct>, C<perl_clone_using>, C<perl_destruct> group from being
+changed to always take ownership of index 0 of the array of C<void *>s that is
+stored at C<PL_my_cxt_list>, before the first call to C<newXS()> or PP code.
 
 Introducing read-only (const) globals is okay, as long as you verify
 with e.g. C<nm libperl.a|egrep -v ' [TURtr] '> (if your C<nm> has
 BSD-style output) that the data you added really is read-only.  (If it
 is, it shouldn't show up in the output of that command.)
 
-If you want to have static strings, make them constant:
+Const static strings are less efficient than double-quoted string literals.
+But if you really want to have static strings, at minimum, make sure they are
+declared C<const>:
 
   static const char etc[] = "...";
 
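
The struct size check described in this hunk can be modeled in a few lines of C. This is a toy sketch under stated assumptions, not perl's handshake code: struct interp_old, struct interp_new, handshake_ok, and grow_slots are hypothetical names, and the real check involves more than a size comparison. The sketch only shows why appending a member to the struct is detected, while resizing a heap block that an existing pointer member refers to is not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The struct as a stale module saw it when it was compiled. */
struct interp_old {
    void  *stack;
    long   flags;
    void **ext_slots;   /* heap array owned by the interpreter */
};

/* The struct as the freshly built core sees it: one member appended. */
struct interp_new {
    void  *stack;
    long   flags;
    void **ext_slots;
    size_t n_slots;     /* the new experimental member */
};

/* Core side of the check: the module reports the size it was built with. */
static int handshake_ok(size_t module_sizeof_interp)
{
    return module_sizeof_interp == sizeof(struct interp_new);
}

/* Growing the heap array changes no struct size, alignment, or offset. */
static int grow_slots(struct interp_new *ip, size_t n)
{
    void **p = realloc(ip->ext_slots, n * sizeof *p);
    if (p == NULL)
        return -1;      /* the old array is still valid on failure */
    ip->ext_slots = p;
    ip->n_slots = n;
    return 0;
}
```

A module built against struct interp_old fails handshake_ok, while a module built against struct interp_new passes both before and after grow_slots runs, because sizeof never sees what a pointer member points to.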
@@ -81,14 +157,60 @@ right combination of C<const>s:
     static const char * const yippee[] =
         {"hi", "ho", "silver"};
 
+C requires that C<static const char []> arrays have unique addresses in an
+equality test.  The linker is prohibited from merging and de-duplicating
+const static arrays with identical length and data content.  This is B<not> true
+for double-quoted C string literals, which linkers de-duplicate efficiently.
+If a string literal is very long, or its contents decrease
+readability of other code, and you desire an alternate token or symbol for that
+string, use a C<#define Msg "long Msg">.  Two references to C<"..."> will
+normally get merged into one copy stored in the binary image.
+
+  static const char etc[] = "...";
+
+This will never be merged in the final binary.  With two such declarations,
+there would be two copies of C<"..."> at two different addresses, each taking
+4 bytes, inside one C<perl.bin>, C<libperl.so>, or XS binary.
+
+Sometimes this inefficiency is a feature.  It goes like this.  Declare a
+C<static const char []> array, place a pointer to that static array
+into a larger global-like or malloc-ed structure, and return control.  Sometime
+later, you regain control and check that global-like or malloc-ed structure.
+Is the C<const char *> still the same address as your C<static const char []>
+array or not?  Whether you still see the same address can be used as a tag,
+flag, or status.
+
+Because the address is guaranteed to be different, any arbitrary core or XS
+code that overwrites the C<const char *> member with a C<""> literal of
+identical contents would be detected.
+
+Perl uses this method inside C<L<perlapi/"PL_sv_yes">>,
+C<L<perlapi/"PL_sv_no">>, and C<L<perlapi/"PL_sv_zero">>.  These three set
+C<SvPVX> to exported const char arrays, C<PL_Yes>, C<PL_No>, and C<PL_Zero>.
+The addresses of these three const char arrays have special meaning and will
+never test C<==> true against the address of a string literal with the same contents.
+
 =item *
 
 Not exporting your new function
 
 Some platforms (Win32, AIX, VMS, OS/2, to name a few) require any
-function that is part of the public API (the shared Perl library) to be
-explicitly marked as exported.  See the discussion about F<embed.pl> in
-L<perlguts>.
+function, or any const or read-write process-global data variable, that is part
+of the public API (the shared Perl library) to be explicitly marked as exported.
+C symbols do not cross between different binary disk files on these platforms
+unless explicitly exported.  If a public API macro uses a non-public API
+function or process-global variable, the non-public API C symbol has to be
+exported so the OS shared library runtime linkers can load XS modules.
+
+Starting in 5.37.1, support for C<__attribute__((visibility("hidden")))> was added.
+This brought explicit export marking semantics for shared library C symbols to
+almost all compilers and platforms.  This greatly helps if the compiler has LTO,
+since heuristic automatic inlining of any function becomes possible, along with
+removal of unused functions that are neither static nor marked as exported.
+
+See the discussion about F<embed.pl> in L<perlguts>.  Export marking is done
+by editing F<embed.fnc> for functions.  For data variables, export marking
+is done through F<perl.h>, F<globvar.sym>, and F<perlvars.h>.
 
 =item *
 
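
The address-as-tag technique described in this hunk can be sketched in standalone C. The names tag_unset, struct slot, slot_init, slot_is_untouched, and slot_says_unset are hypothetical and only serve the illustration. The sketch assumes a conforming compiler and linker that do not merge named constant arrays with string literals.

```c
#include <assert.h>
#include <string.h>

/* A named array is its own object, so its address is unique. */
static const char tag_unset[] = "unset";

/* Hypothetical structure that other code may modify while we are away. */
struct slot {
    const char *tag;
};

static void slot_init(struct slot *s)
{
    s->tag = tag_unset;
}

/* Identity check: compares addresses, never the characters. */
static int slot_is_untouched(const struct slot *s)
{
    return s->tag == tag_unset;
}

/* Content check, for contrast: cannot tell the sentinel from a copy. */
static int slot_says_unset(const struct slot *s)
{
    return strcmp(s->tag, "unset") == 0;
}
```

After slot_init, both checks succeed. If some other code later stores the literal "unset" in the member, slot_says_unset still succeeds but slot_is_untouched fails, which is exactly the detection the text describes.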
@@ -609,6 +731,12 @@ to be I<exactly> 32 bits (they are I<at least> 32 bits), nor are they
 guaranteed to be C<int> or C<long>.  If you explicitly need 64-bit
 variables, use C<I64> and C<U64>.
 
+If you are writing CPAN code, you need to support older compilers and Perls
+without 64-bit integers.  For CPAN code only, you must check the HAS_QUAD define
+and guard off your C<I64> and C<U64> code if they aren't implemented on that system.
+
+See L<perlguts/"What is an E<quot>IVE<quot>?">.
+
 =item *
 
 Assuming one can dereference any type of pointer for any type of data
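
The HAS_QUAD guard recommended in this hunk might look like the following sketch. It assumes that, inside an XS module, perl.h has already been included and supplies HAS_QUAD and U64 where available. The names counter_t, COUNTER_MAX, and counter_add_sat are hypothetical. Compiled on its own, without perl.h, the fallback branch is the one that gets used.

```c
#include <assert.h>

/* With perl.h included, HAS_QUAD is defined when the platform has 64-bit
 * integers, and U64 is then usable.  Without it, fall back to a type the
 * C standard guarantees to be at least 32 bits wide. */
#ifdef HAS_QUAD
typedef U64 counter_t;
#else
typedef unsigned long counter_t;
#endif

/* Largest value of counter_t, whatever width was selected above. */
#define COUNTER_MAX ((counter_t)~(counter_t)0)

/* Saturating add: clamps at COUNTER_MAX instead of wrapping around, so the
 * code behaves sensibly with either the 64-bit or the fallback type. */
static counter_t counter_add_sat(counter_t a, counter_t b)
{
    return (b > COUNTER_MAX - a) ? COUNTER_MAX : a + b;
}
```

Keeping the width-dependent part behind one typedef and one macro means the rest of the module never mentions U64 directly, so nothing else needs its own guard.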
@@ -626,7 +754,8 @@ Lvalue casts
   (int)*p = ...;    /* BAD */
 
 Simply not portable.  Get your lvalue to be of the right type, or maybe
-use temporary variables, or dirty tricks with unions.
+use temporary variables, C<*(int*)&p = ...;>, or dirty tricks with unions.
+Keep alignment, size, and compiling as C++ in mind.
 
 =item *
 
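
The alternatives to an lvalue cast named in this hunk can be sketched as follows. The function and type names are hypothetical. The memcpy variant is not mentioned in the text; it is added here as a common way to sidestep the alignment concern, and the union variant is valid C but should not be relied on when the same source is compiled as C++.

```c
#include <assert.h>
#include <string.h>

/* Temporary variable: convert first, then assign through the real type.
 * This replaces the invalid "(int)*p = v;" for a long *p. */
static void assign_via_temp(long *p, int v)
{
    long tmp = v;
    *p = tmp;
}

/* memcpy: copies the bytes of an int into a buffer of any alignment. */
static void store_int(unsigned char *dst, int v)
{
    memcpy(dst, &v, sizeof v);
}

static int load_int(const unsigned char *src)
{
    int v;
    memcpy(&v, src, sizeof v);
    return v;
}

/* Union: both members share storage, and the union itself is aligned
 * suitably for the int member. */
union int_bytes {
    int           i;
    unsigned char b[sizeof(int)];
};
```

store_int and load_int work even when the buffer address is odd, which a cast such as *(int*)ptr does not guarantee on strict-alignment platforms.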
@@ -1310,7 +1439,7 @@ similar output to CPAN module L<B::Debug>.
 
 # finish this later #
 
-=head2 Using gdb to look at specific parts of a program
+=head2 Using gdb to look at specific parts of Perl code
 
 With the example above, you knew to look for C<Perl_pp_add>, but what
 if there were multiple calls to it all over the place, or you didn't