Issue #2: no warning with invalid pd.set_option('display.max_colwidth') - matifaro/pandas GitHub Wiki

Issue - no warning with invalid pd.set_option('display.max_colwidth')

The issue

When users set display.max_colwidth to 3 or less, pandas silently ignores the setting: no truncation occurs and no warning is emitted, even though the ellipsis ("...") alone occupies three characters. Users therefore get no feedback that the option is ineffective. The following snippet reproduces the issue:

import pandas as pd

df = pd.DataFrame([
    ['foo', 'bar', 'bim', 'uncomfortably long string'],
    ['horse', 'cow', 'banana', 'apple']
])

# No truncation at all, despite setting to 2
pd.set_option('display.max_colwidth', 2)
print(df)
#            0    1       2                          3
# 0        foo  bar     bim  uncomfortably long string
# 1      horse  cow  banana                      apple

# Truncation takes effect with a larger value such as 6
pd.set_option('display.max_colwidth', 6)
print(df)
#           0    1      2      3
# 0       foo  bar    bim  un...
# 1     horse  cow  ba...  apple

Note: Setting to 2 or 3 produces exactly the same output as the default.
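The behaviour above is consistent with a truncation rule that only shortens a string when the limit leaves room for the ellipsis plus at least one character. The following is a simplified model of such a rule, not pandas' actual formatting code; the function name truncate_cell and the exact condition are assumptions made for illustration, and the model ignores any padding the real formatter adds.

```python
from typing import Optional

ELLIPSIS = "..."


def truncate_cell(text: str, max_colwidth: Optional[int]) -> str:
    """Hypothetical model of ellipsis truncation (not pandas' real code)."""
    if max_colwidth is None:
        return text  # no limit configured
    # Assumption: a limit that cannot fit the ellipsis plus one character
    # is ignored rather than rejected, which is the silent failure.
    if max_colwidth <= len(ELLIPSIS):
        return text
    if len(text) <= max_colwidth:
        return text  # already fits within the limit
    return text[: max_colwidth - len(ELLIPSIS)] + ELLIPSIS


long_value = "uncomfortably long string"
print(truncate_cell(long_value, 2))   # limit ignored, full string printed
print(truncate_cell(long_value, 10))  # uncomfo...
```

Under this model, any limit of 3 or less behaves exactly like having no limit at all, which is why the setting appears to do nothing.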

Problem Description

  • Silent failure: Values of max_colwidth below 4 are accepted but have no effect on how strings are displayed.
  • No warning or error: Users receive zero feedback that their configuration is invalid or ineffective.
  • Undocumented lower bound: The documentation does not specify that meaningful values must be ≥ 4 (to accommodate the three-character ellipsis).

The root cause is that the validator registered for this option, is_nonnegative_int, checks only that the value is None or a non-negative integer and does not enforce a minimum meaningful width. As a result, values from 0 to 3, which leave no room for any content next to the three-character ellipsis, pass validation and are applied without error or warning. Only negative values and non-integers are rejected.
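To make the gap concrete, here is a stand-in for a non-negative-integer check of the kind named in the registration snippet further down this page (is_nonnegative_int). The function below is a re-implementation written for illustration, not the pandas source, and its name is hypothetical.

```python
def nonnegative_int_or_none(value: object) -> None:
    """Illustrative stand-in: accept None or an int >= 0, otherwise raise."""
    if value is None:
        return
    if isinstance(value, int) and value >= 0:
        return
    raise ValueError("Value must be a nonnegative integer or None")


# Values 0 through 3 pass, although none of them can produce truncation.
for width in (0, 1, 2, 3):
    nonnegative_int_or_none(width)  # no exception, no warning

# Negative values are rejected by this kind of check.
try:
    nonnegative_int_or_none(-1)
except ValueError as exc:
    print(f"rejected: {exc}")
```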

Requirements

Here are the key requirements to fully address and close this issue:

  • Enforce a meaningful minimum value
    – Update the validator for display.max_colwidth so that it only accepts None or integers ≥ 4, since the ellipsis alone consumes three characters and at least one more is needed to show any content.
  • Reject too‐small values
    – In the validator, if the value is not None and is less than 4, raise a ValueError so that setting an ineffective width fails immediately.
  • Register the improved option
    – In pandas/core/config_init.py, call cf.register_option for "display.max_colwidth" with:
    validator=validate_max_colwidth,         # rejects values < 4
    
  • Documentation update
    – Amend the max_colwidth docstring and user‐facing docs to clearly state that values below 4 will not meaningfully truncate output (and that the ellipsis itself is three characters long).
  • Comprehensive tests
    – Add unit tests to verify that:
    – Values below 4 (including negative values) raise ValueError.
    – Invalid types (e.g. strings) still raise an appropriate error.
  • Backward‐compatibility & cleanup
    – Confirm that existing behavior for None (no limit) and large integer values remains unchanged, and remove any vestigial validators or callbacks that are now superseded.
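One way to satisfy both the minimum-value and the invalid-type requirements in a single validator is sketched below. This is a hypothetical variant, not the validator proposed in the design section; the name validate_max_colwidth_strict, the constant MIN_COLWIDTH, and the error messages are assumptions.

```python
MIN_COLWIDTH = 4  # three-character ellipsis plus one visible character


def validate_max_colwidth_strict(value: object) -> None:
    """Hypothetical validator: accept None or an int >= MIN_COLWIDTH."""
    if value is None:
        return  # None keeps its meaning of "no limit"
    if not isinstance(value, int):
        # Check the type first so strings and floats fail with a clear message
        raise ValueError(
            f"Value must be an integer or None, got {type(value).__name__}"
        )
    if value < MIN_COLWIDTH:
        raise ValueError(
            f"Value must be at least {MIN_COLWIDTH} or None, got {value}"
        )
```

The type check matters because a bare comparison such as "10" < 4 raises TypeError in Python 3, while a float such as 10.5 would pass a plain x < 4 test unnoticed. Checking the type first handles both cases with one consistent error.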

Design of the fix

  1. New validator
    In pandas/_config/config.py, add:

    def validate_max_colwidth(x: int | None) -> int | None:
        """
        Allow None or any integer ≥ 4; raise ValueError if x < 4.
        """
        if x is not None and x < 4:
            raise ValueError("Value must be bigger than 3")
        return x
    
  2. Hook up the validator
    In pandas/core/config_init.py, replace the old is_nonnegative_int validator for display.max_colwidth with our new one:

    - cf.register_option(
    -     "max_colwidth",
    -     50,
    -     max_colwidth_doc,
    -     validator=is_nonnegative_int,
    - )
    + from pandas._config.config import validate_max_colwidth
    +
    + cf.register_option(
    +     "max_colwidth",
    +     50,
    +     max_colwidth_doc,
    +     validator=validate_max_colwidth,
    + )
    
  3. Update the unit test
    In tests/io/formats/test_format.py, extend test_max_colwidth_negative_int_raises so it also checks that values below 4 raise:

    def test_max_colwidth_negative_int_raises(self):
        with pytest.raises(ValueError, match="Value must be bigger than 3"):
            with option_context("display.max_colwidth", -1):
                pass
        with pytest.raises(ValueError, match="Value must be bigger than 3"):
            with option_context("display.max_colwidth", 3):
                pass
    

With these three changes, any attempt to set display.max_colwidth to 3 or less will immediately raise the intended ValueError, closing out the issue as designed.
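Why a validator registered this way takes effect at set time can be illustrated with a toy option registry. This is a deliberately simplified model, not pandas' configuration machinery; ToyOptions and its register, set, and get methods are hypothetical names.

```python
from typing import Any, Callable, Optional


class ToyOptions:
    """Minimal option store that runs a validator before every assignment."""

    def __init__(self) -> None:
        self._values: dict = {}
        self._validators: dict = {}

    def register(
        self, key: str, default: Any, validator: Callable[[Any], Any]
    ) -> None:
        validator(default)  # the default must itself be valid
        self._values[key] = default
        self._validators[key] = validator

    def set(self, key: str, value: Any) -> None:
        self._validators[key](value)  # raises before anything is stored
        self._values[key] = value

    def get(self, key: str) -> Any:
        return self._values[key]


def validate_max_colwidth(x: Optional[int]) -> Optional[int]:
    # Same check as the validator in the design above
    if x is not None and x < 4:
        raise ValueError("Value must be bigger than 3")
    return x


options = ToyOptions()
options.register("display.max_colwidth", 50, validate_max_colwidth)

try:
    options.set("display.max_colwidth", 3)
except ValueError as exc:
    print(f"rejected: {exc}")

# The rejected value was never stored, so the default is still in place.
print(options.get("display.max_colwidth"))
```

Because the validator runs before the value is stored, a rejected setting leaves the previous value untouched.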

Source code files

The source files modified for this issue were:

  • Creating the validator:
    • pandas/_config/config.py - create function validate_max_colwidth
  • Changing the option registration:
    • pandas/core/config_init.py - swap the validator for a more restrictive one (the new function); the current registration is:
cf.register_option(
        "max_colwidth",
        50,
        max_colwidth_doc,
        validator=is_nonnegative_int,
    )
  • Testing
    • tests/io/formats/test_format.py - update the test function test_max_colwidth_negative_int_raises to the new condition; the current test is:
    def test_max_colwidth_negative_int_raises(self):
        # Deprecation enforced from:
        # https://github.com/pandas-dev/pandas/issues/31532
        with pytest.raises(
            ValueError, match="Value must be a nonnegative integer or None"
        ):
            with option_context("display.max_colwidth", -1):
                pass

Fix source code

We start by creating the new validator in the file pandas/_config/config.py. It rejects the ineffective values this issue is about.

def validate_max_colwidth(x: int | None) -> int | None:
    if x is not None and x < 4:
        raise ValueError("Value must be bigger than 3")
    return x

Next, in the file pandas/core/config_init.py, we register the option with the new validator in place of is_nonnegative_int.

cf.register_option(
        "max_colwidth",
        50,
        max_colwidth_doc,
        validator=validate_max_colwidth,
    )

To verify the new behaviour, we update the test to match the new condition. We change the function test_max_colwidth_negative_int_raises in the file tests/io/formats/test_format.py.

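    def test_max_colwidth_negative_int_raises(self):
        # Deprecation enforced from:
        # https://github.com/pandas-dev/pandas/issues/31532
        # Each value needs its own pytest.raises block: once the first
        # option_context raises, nothing after it in the same block runs.
        with pytest.raises(ValueError, match="Value must be bigger than 3"):
            with option_context("display.max_colwidth", -1):
                pass
        with pytest.raises(ValueError, match="Value must be bigger than 3"):
            with option_context("display.max_colwidth", 3):
                pass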
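The same cases can also be written table-driven, so that every boundary value is checked independently. The sketch below uses the standard library's unittest and a local copy of the validator so that it runs without a patched pandas. The class and test names are hypothetical, and the real test would go through option_context as shown above.

```python
import unittest
from typing import Optional


def validate_max_colwidth(x: Optional[int]) -> Optional[int]:
    # Local copy of the validator from the fix, for a self-contained example
    if x is not None and x < 4:
        raise ValueError("Value must be bigger than 3")
    return x


class TestMaxColwidthValidator(unittest.TestCase):
    def test_too_small_values_raise(self) -> None:
        # subTest reports each failing value separately
        for value in (-1, 0, 1, 2, 3):
            with self.subTest(value=value):
                with self.assertRaisesRegex(ValueError, "bigger than 3"):
                    validate_max_colwidth(value)

    def test_valid_values_pass(self) -> None:
        for value in (None, 4, 50, 10_000):
            with self.subTest(value=value):
                self.assertEqual(validate_max_colwidth(value), value)


# Run the suite explicitly so the example also works when pasted into a script
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMaxColwidthValidator)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```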

Submit the fix

We have not yet submitted this fix, as we are still waiting for a response from the head of the collaborators.