CoPilot comment: Replace std::smatch with std::cmatch to eliminate temporary string copy in StdRegexStrategy #342

@daantimmer

Description

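In StdRegexStrategy::Match (cucumber_cpp/library/cucumber_expression/StdRegexStrategy.cpp), the input std::string_view text is currently copied into a local std::string textStr. The copy exists only because std::smatch is std::match_results<std::string::const_iterator>, so std::regex_search can fill it only from a std::string or a pair of std::string iterators. The allocation is unnecessary if we switch to the std::regex_search overload that takes an iterator range, paired with a std::match_results whose iterator type matches that range. Note that the standard library has no std::svmatch alias, and std::string_view::const_iterator is implementation-defined. The simplest portable choice is therefore std::match_results<const char*> (std::cmatch) over text.data() and text.data() + text.size(); std::match_results<std::string_view::const_iterator> over text.begin() and text.end() would also work.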

Current code:

std::optional<std::vector<std::optional<MatchGroup>>> StdRegexStrategy::Match(std::string_view text) const
{
    std::string textStr{ text };  // unnecessary copy
    std::smatch match;

    if (!std::regex_search(textStr, match, regex))
        return std::nullopt;

    // ... extract groups from match ...
}

Proposed approach:

std::optional<std::vector<std::optional<MatchGroup>>> StdRegexStrategy::Match(std::string_view text) const
{
    std::cmatch match;  // match_results<const char*>

    if (!std::regex_search(text.data(), text.data() + text.size(), match, regex))
        return std::nullopt;

    // positions now refer directly into the original text buffer
    // match[i].str() still returns std::string copies for value
    // match.position(i) and match.length(i) work as before
}

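Using std::cmatch (i.e. std::match_results<const char*>) avoids the temporary string entirely. The iterator-range overload regex_search(const char*, const char*, std::cmatch&, const std::regex&) is part of the standard library and does not require a null-terminated buffer, so it is safe on a std::string_view that is a slice of a larger string. The sub_match iterators in the resulting cmatch point directly into the original text buffer, so the match must be consumed before that buffer is destroyed or modified; that is already the case here.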
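As a rough illustration of the proposed approach, here is a minimal self-contained sketch that also fills in the group extraction. GroupSketch and MatchSketch are hypothetical names invented for this example. The project's real MatchGroup type and its extraction logic are not shown in this issue and may differ. The regex is passed as a parameter here only to keep the sketch free-standing.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the project's MatchGroup; the real type may differ.
struct GroupSketch
{
    std::string value;  // copy of the matched text
    std::size_t start;  // offset from the beginning of the searched text
    std::size_t length; // number of matched characters
};

// Searches `text` in place; no std::string copy of the input is made.
std::optional<std::vector<std::optional<GroupSketch>>> MatchSketch(const std::regex& regex, std::string_view text)
{
    std::cmatch match; // std::match_results<const char*>

    // Iterator-range overload: works on [data, data + size), no null terminator needed.
    if (!std::regex_search(text.data(), text.data() + text.size(), match, regex))
        return std::nullopt;

    std::vector<std::optional<GroupSketch>> groups;
    groups.reserve(match.size());

    for (std::size_t i = 0; i != match.size(); ++i)
    {
        if (!match[i].matched)
        {
            // Optional capture group that did not participate in the match.
            groups.push_back(std::nullopt);
            continue;
        }

        groups.push_back(GroupSketch{
            match[i].str(),
            static_cast<std::size_t>(match.position(i)),
            static_cast<std::size_t>(match.length(i)) });
    }

    return groups;
}
```

Because each GroupSketch owns a std::string copy of its value, nothing in the returned vector refers back to the searched buffer once the function returns. If the groups were instead exposed as std::string_view slices of text, the caller would have to keep the original buffer alive for as long as the groups are used.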

File: cucumber_cpp/library/cucumber_expression/StdRegexStrategy.cpp
