Tag: python

  • Better Enumerations in Python

    When working with more than one language, we look for similar approaches across them. In Rust, there’s a keyword called enum for defining and using enumerations. It’s easy to use; let’s look at this example from the official documentation:

    enum IpAddrKind {
        V4,
        V6,
    }
    
    struct IpAddr {
        kind: IpAddrKind,
        address: String,
    }
    
    let home = IpAddr {
        kind: IpAddrKind::V4,
        address: String::from("127.0.0.1"),
    };
    
    let loopback = IpAddr {
        kind: IpAddrKind::V6,
        address: String::from("::1"),
    };

    If we want to write this code in Python, we can simply define two classes and two variables. Let’s start with a simple alternative first:

    class IpAddrKind:
        V4 = 1
        V6 = 2
    
    
    class IpAddr:
        def __init__(self, kind: IpAddrKind, address: str) -> None:
            self.kind = kind
            self.address = address
    
    
    home = IpAddr(kind=IpAddrKind.V4, address="127.0.0.1")
    loopback = IpAddr(kind=IpAddrKind.V6, address="::1")

    It’s definitely not the same thing as the Rust version, but it keeps the kind parameter consistent across the IP address objects, and you can group or filter the objects by kind when you need to. I could also use dataclass for IpAddr, which has been available since version 3.7:

    import dataclasses
    
    
    @dataclasses.dataclass(frozen=True)
    class IpAddrKind:
        V4 = 1
        V6 = 2
    
    
    @dataclasses.dataclass
    class IpAddr:
        kind: IpAddrKind
        address: str

    But these are still not real enumerations. Let’s discuss why we need a specific keyword or a language feature for enumerations:

    • The enumeration class should only do one thing. Currently it’s just a class, and nothing tells the reader that it’s an enumeration. Even if we set the values manually, we don’t know which ones are enumerated types. There’s no relationship between the members of the class.
    • When I read the code, I should quickly understand that it’s an enum with unique constants, so I can use these constants to define the types of some objects and compare them. In other words, if I have an enum type for colors and I want to check the type of a color, it should be a Color, for example. Not an integer, not a string: I expect a custom enum type.
    • All enumerations have common basic functionalities. Members should be immutable, iterable, accessible by value and by name, and it should also be possible to get members as a list. I should not have to create a base class to add them to.
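    The points above can be seen with the plain class version. This is only a small sketch reusing the IpAddrKind class from earlier; the variable names are illustrative:

```python
class IpAddrKind:
    V4 = 1
    V6 = 2


# The "members" are plain integers, not a dedicated enum type.
kind_type = type(IpAddrKind.V4)

# Any unrelated integer compares equal to a "member".
equals_plain_int = IpAddrKind.V4 == 1

# Nothing prevents reassignment; the constant silently changes.
IpAddrKind.V4 = 99
reassigned_value = IpAddrKind.V4

# The class is not iterable, so there is no built-in way to list members.
try:
    list(IpAddrKind)
    iterable = True
except TypeError:
    iterable = False
```

    Here kind_type is int, the reassignment goes through without any error, and trying to iterate over the class raises a TypeError.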

    Actually, Python has had an Enum type since version 3.4. But using enum in Python feels like using an external package such as dateutil: you import the Enum class and inherit from it in your enum type. Let me show:

    from enum import Enum
    import dataclasses
    
    
    class IpAddrKind(Enum):
        V4 = 1
        V6 = 2
    
    
    @dataclasses.dataclass
    class IpAddr:
        kind: IpAddrKind
        address: str

    I think it’s still better than using dataclass, because it sets clear expectations: there’s a specific enum type, each member has value and name attributes, and the members are immutable (if you try to reassign one, you get an AttributeError). All is fine.

    >>> home = IpAddr(kind=IpAddrKind.V4, address="127.0.0.1")
    
    # the type is an enum.
    >>> home.kind
    <IpAddrKind.V4: 1>
    
    >>> type(home.kind)
    <enum 'IpAddrKind'>
    
    # there's a value and the type of value is integer.
    # the best practice is not using the value directly, because 1 is meaningless.
    >>> home.kind.value
    1
    
    >>> type(home.kind.value)
    <class 'int'>
    
    # it's also possible to access the name of a member.
    # sometimes we need it when we get data from an external service and want to remap the values, it depends.
    >>> home.kind.name
    'V4'
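    The common functionality listed earlier (iteration, access by value and by name, immutability) comes for free with Enum. A minimal sketch, with illustrative variable names:

```python
from enum import Enum


class IpAddrKind(Enum):
    V4 = 1
    V6 = 2


# Iteration yields the members in definition order.
members = list(IpAddrKind)

# Lookup by value (call syntax) and by name (subscript syntax).
by_value = IpAddrKind(1)
by_name = IpAddrKind["V6"]

# Members are singletons, so identity comparison works.
same_member = by_value is IpAddrKind.V4

# Reassigning a member raises AttributeError.
try:
    IpAddrKind.V4 = 3
    reassigned = True
except AttributeError:
    reassigned = False

# For a plain Enum, a member is not equal to its raw value.
equals_raw_int = IpAddrKind.V4 == 1
```

    Note that equals_raw_int is False: a plain Enum member is its own type, which is exactly the "not integer, not string" behaviour we were looking for.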

    Technically, we have achieved what we wanted, but let’s think about usability. Do I have to assign a numerical value to each member? Partially, yes: unlike in Rust, it’s not possible to define just the member names without values. But there are some options:

    from enum import auto, Enum
    
    
    # option 1
    class IpAddrKind(Enum):
        V4, V6 = range(1, 3)
    
    
    # option 2: functional way
    IpAddrKind = Enum("IpAddrKind", "V4 V6")
    
    
    # option 3: just assign the first value
    class IpAddrKind(Enum):
        V4 = 1
        V6 = auto()  # auto() picks the next value after the previous member (2 here)

    I’m not sure which option is better. I guess the first one doesn’t improve readability, and option 2 is better than option 1. I would also prefer to write the values manually instead of using auto() as in option 3. But yeah, it depends again. It’s also possible to set the value of a member from its name. I haven’t covered all the features of the enum module in this article, but it’s also possible to support bitwise operators by using Flag instead of Enum, to make the member values str instead of int by using StrEnum (available since Python 3.11), or to force the values to be integers by using IntEnum, etc. See the official documentation for more information.
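    As a rough sketch of some of those features: Priority and Permission below are hypothetical types made up for illustration, and StrEnum is left out so the code also runs on versions older than 3.11.

```python
from enum import Enum, Flag, IntEnum, auto


class IpAddrKind(Enum):
    V4 = auto()   # auto() starts at 1 by default
    V6 = auto()   # then 2


class Priority(IntEnum):   # hypothetical example type
    LOW = 1
    HIGH = 2


class Permission(Flag):    # hypothetical example type
    READ = auto()    # 1
    WRITE = auto()   # 2


auto_values = [member.value for member in IpAddrKind]

# IntEnum members behave like integers in comparisons and arithmetic.
int_like = Priority.LOW < Priority.HIGH and Priority.HIGH + 1 == 3

# Flag members combine with bitwise operators.
read_write = Permission.READ | Permission.WRITE
can_write = Permission.WRITE in read_write
```

    IntEnum trades some type safety for convenience, because its members compare equal to plain integers, so a plain Enum is usually the safer default.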


    Before I conclude my article, I would like to add one last thing about enum use in Rust. I remember that it’s possible to use algebraic data types in Rust, but you probably won’t need them in Python, because Rust is not a dynamically typed language and it forces you to work with types all the time. Actually, there’s a better example in Kobzol’s blog (see the section Algebraic data types), but let’s stick with the same sample here:

    enum IpAddr {
        V4(String),
        V6(String),
    }
    
    let home = IpAddr::V4(String::from("127.0.0.1"));
    let loopback = IpAddr::V6(String::from("::1"));

    How could we write it in Python? Should we use Enum or dataclass, or just a class?
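    One possible answer, offered only as a sketch and not as the definitive way: model each variant as a small frozen dataclass and treat the union of the variants as the "enum". The describe function is a hypothetical helper added for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class V4:
    address: str


@dataclass(frozen=True)
class V6:
    address: str


# The "enum" is the union of its variants.
IpAddr = Union[V4, V6]


def describe(ip: IpAddr) -> str:
    # isinstance plays the role of Rust's `match` on the variant.
    if isinstance(ip, V4):
        return f"IPv4 {ip.address}"
    if isinstance(ip, V6):
        return f"IPv6 {ip.address}"
    raise TypeError(f"unexpected variant: {ip!r}")


home = V4("127.0.0.1")
loopback = V6("::1")
```

    A type checker can then verify that only V4 or V6 values are passed where an IpAddr is expected, although Python itself does not enforce this at runtime.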

  • Syntax Errors in Logical Operators

    Python favours letters and meaningful keywords for the logical operators, and all of them return a value depending on the conditions:

    1. not: Reverses the logical state of its operand. If the condition is True, it returns False, and vice versa.
    2. and: Takes two operands and returns True if both operands are True.
    3. or: Similar to and, but it returns True if at least one of the operands is True.
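    A small sketch of the three operators with non-boolean operands (the variable names are only illustrative). Notice that not always gives back a bool, while and and or give back one of their operands:

```python
# `not` always returns a bool.
not_result = not "text"          # False

# `and` returns the first falsy operand, or the last operand if all are truthy.
and_truthy = 1 and "yes"         # "yes"
and_falsy = 0 and "yes"          # 0

# `or` returns the first truthy operand, or the last operand if all are falsy.
or_truthy = "" or "default"      # "default"
or_falsy = 0 or []               # []
```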

    The operands do not have to be boolean: and and or return one of the operands themselves instead of a strict True or False, which keeps the code simple to write. Now let’s look at this output to see the difference between not and the other operators, focusing on the position of the carets (^):

    >>> not
      File "<stdin>", line 1
        not
           ^
    SyntaxError: invalid syntax
    
    >>> and
      File "<stdin>", line 1
        and
        ^^^
    SyntaxError: invalid syntax
    
    >>> or
      File "<stdin>", line 1
        or
        ^^
    SyntaxError: invalid syntax
    
    >>> or 0
      File "<stdin>", line 1
        or 0
        ^^^
    SyntaxError: invalid syntax

    The syntax error starts at the first character of and and or, but at the end of not, after the t. Is this normal? Absolutely, yes. not is a unary operator: it only needs one operand on its right, so the parser accepts the keyword and fails only when nothing follows it. We cannot say the same thing for the others: and and or must sit between two conditions, so the parser fails at the keyword itself because there is nothing on its left. Now let’s give not a condition to solve the syntax error:

    >>> bool(1)
    True
    
    >>> not 1
    False
    
    >>> bool(0)
    False
    
    >>> not 0
    True

    1 is a truthy value, something like YES or ON in programming logic, but we got False when we put the not keyword before it. I am aware that I am explaining very simple and boring things, but please be patient. Let’s make the syntax a bit confusing:

    >>> not(1)
    False
    
    >>> not(0)
    True
    
    >>> not()
    True
    
    >>> bool()
    False

    Wait... Is there also a built-in function for not? No, it’s definitely not a function. Honestly, I would expect a syntax error because of the missing space between the not keyword and the parentheses, but formatter tools will probably add it automatically anyway. The parentheses here only group the operand, that’s all. And when you run not with empty parentheses, it always returns True, because () is an empty tuple, and empty tuples, lists and dictionaries are falsy values:

    >>> not{}
    True
    
    >>> not[]
    True
    
    >>> not      ()
    True
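    One way to convince yourself that not(1) is not a function call is to look at how Python parses it. This is a minimal sketch using the standard ast module; the variable names are illustrative:

```python
import ast

# Parse the expressions without running them.
with_parens = ast.parse("not(1)", mode="eval").body
with_space = ast.parse("not 1", mode="eval").body
empty_parens = ast.parse("not()", mode="eval").body

# Both spellings produce a unary `not` node, never a function call.
is_unary = isinstance(with_parens, ast.UnaryOp) and isinstance(with_parens.op, ast.Not)
same_tree = ast.dump(with_parens) == ast.dump(with_space)

# In `not()`, the operand is an empty tuple.
operand_is_empty_tuple = (
    isinstance(empty_parens.operand, ast.Tuple) and empty_parens.operand.elts == []
)
```

    Both not(1) and not 1 end up as the same syntax tree, so the parentheses only group the operand.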

    The best thing to do is to use a code formatter and read the code that way. This is not always possible, so we have to get our eyes used to strange syntax. Let me end the article with an uglier (but valid) example:

    if not(False or False) and(True):
        print("This will print")

    Will it print the expected output, or raise a syntax error?

  • String quotes in PEP8

    I use PEP8 in my Python projects except for a few rules.

    I think ‘single-quote’ and “double-quote” characters should NOT be the same. It would be better if these two characters had different meanings in the Python parser, so we could use one of them for template strings (or f-strings). Having two interchangeable quote characters doesn’t improve readability; it causes inconsistency instead. Until now, I used single quotes for variable definitions and double quotes for visible text. But from now on, I plan to always use double quotes, as in many other programming languages.