Sphinxsearch results weights with different rankers - sphinx

I have one index "name_and_title_index" with two fields "name" and "title".
Indextool gives me this information on the keywords of interest:
keyword, docs, hits, offset
word7, 56, 57, 519386707
word8, 154, 161, 475390304
word2, 2438, 2597, 14258546
word3, 26599, 29074, 68018978
word5, 475349, 656569, 191390685
word1, 645079, 881965, 303666122
word6, 1089457, 1435180, 350540391
indexed_documents - 10742342, total keywords - 1379888
It seems I do not understand rankers, since all of them return results in a different order than I expect.
I would expect any result containing word7 to have a higher weight (there are only 56 docs out of 10.7M).
The SphinxQL is:
SELECT
ID,
WEIGHT(),
SNIPPET(name, 'word1 word2 word3 word4 word5 word6') AS _name,
SNIPPET(title, 'word7 word8 word9') AS _title
FROM
name_and_title_index
WHERE
MATCH('@name "word1 word2 word3 word4 word5 word6"/0.5 @title "word7 word8 word9"/0.5')
Different rankers give me the following results:
RANKER=PROXIMITY_BM25;
| 1 | 6546 | _ <b>word6</b> <b>word1</b> <b>word2</b> <b>word3</b> | _ _ <b>word8</b> _ _ <b>word7</b> |
| 4 | 6528 | _ _ _ _ _ _ _ _ <b>word2</b> <b>word3</b> <b>word4</b> _ | _ _ <b>word8</b> _ _ _ _ _ ... |
| 2 | 4521 | <b>word5</b> <b>word6</b> _ _ _ _ _ _ <b>word1</b> _ _ | _ <b>word7</b> _ _ _ _ _ _ _ _ ... |
| 3 | 4520 | <b>word5</b> _ <b>word1</b> _ _ _ _ _ <b>word6</b> _ _ | _ _ _ _ _ _ _ _ _ _ _ _ <b>word7</b> |
| 5 | 4519 | <b>word1</b> _ _ _ _ _ <b>word5</b> <b>word6</b> _ _ _ _ | _ _ _ _ _ _ <b>word8</b> _ _ _ _ _ _ |
| 6 | 2520 | <b>word5</b> _ _ _ _ _ ... _ _ _ _ <b>word6</b> _ _ _ _ _ ... | ... _ _ _ _ _ _ _ <b>word8</b> _ _ |
RANKER=BM25;
| 1 | 2546 | _ <b>word6</b> <b>word1</b> <b>word2</b> <b>word3</b> | _ _ <b>word8</b> _ _ <b>word7</b> |
| 4 | 2528 | _ _ _ _ _ _ _ _ <b>word2</b> <b>word3</b> <b>word4</b> _ | _ _ <b>word8</b> _ _ _ _ _ ... |
| 2 | 2521 | <b>word5</b> <b>word6</b> _ _ _ _ _ _ <b>word1</b> _ _ | _ <b>word7</b> _ _ _ _ _ _ _ _ ... |
| 3 | 2520 | <b>word5</b> _ <b>word1</b> _ _ _ _ _ <b>word6</b> _ _ | _ _ _ _ _ _ _ _ _ _ _ _ <b>word7</b> |
| 5 | 2520 | <b>word1</b> _ _ _ _ _ <b>word5</b> <b>word6</b> _ _ _ _ | _ _ _ _ _ _ <b>word8</b> _ _ _ _ _ _ |
| 6 | 2519 | <b>word5</b> _ _ _ _ _ ... _ _ _ _ <b>word6</b> _ _ _ _ _ ... | ... _ _ _ _ _ _ _ <b>word8</b> _ _ |
RANKER=SPH04;
| 4 | 16528 | _ _ _ _ _ _ _ _ <b>word2</b> <b>word3</b> <b>word4</b> _ | _ _ <b>word8</b> _ _ _ _ _ ... |
| 1 | 14546 | _ <b>word6</b> <b>word1</b> <b>word2</b> <b>word3</b> | _ _ <b>word8</b> _ _ <b>word7</b> |
| 2 | 14521 | <b>word5</b> <b>word6</b> _ _ _ _ _ _ <b>word1</b> _ _ | _ <b>word7</b> _ _ _ _ _ _ _ _ ... |
| 3 | 14520 | <b>word5</b> _ <b>word1</b> _ _ _ _ _ <b>word6</b> _ _ | _ _ _ _ _ _ _ _ _ _ _ _ <b>word7</b> |
| 5 | 14519 | <b>word1</b> _ _ _ _ _ <b>word5</b> <b>word6</b> _ _ _ _ | _ _ _ _ _ _ <b>word8</b> _ _ _ _ _ _ |
| 6 | 10520 | <b>word5</b> _ _ _ _ _ ... _ _ _ _ <b>word6</b> _ _ _ _ _ ... | ... _ _ _ _ _ _ _ <b>word8</b> _ _ |
Why is result 4 always ranked higher than results 2 and 3 (and with SPH04 even higher than result 1)?
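When a built-in ranker orders results unexpectedly, Sphinx can show the per-document factors behind WEIGHT(). This is a sketch assuming Sphinx 2.2+, where PACKEDFACTORS() and the expression-based ranker are available; the expression below is the documented equivalent of PROXIMITY_BM25, which you can then tweak (e.g. scale by per-field weights) once you see which factor dominates:

```sql
-- Inspect the ranking factors Sphinx computed for each match (Sphinx 2.2+).
SELECT ID, WEIGHT(), PACKEDFACTORS()
FROM name_and_title_index
WHERE MATCH('@name "word1 word2 word3 word4 word5 word6"/0.5 @title "word7 word8 word9"/0.5')
OPTION ranker=expr('sum(lcs*user_weight)*1000+bm25');
```

PACKEDFACTORS() exposes per-field values such as lcs and the BM25 contribution, which makes it possible to see why a document with only common words can still outscore one containing the rare word7.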


Freezegun's freeze_time throws odd transformers error when used

I have a test that throws an odd error when freeze_time is called. Reading the stack trace, it is most likely interacting with another dependency in an odd way, but I don't have enough experience with Python to understand what the problem may be.
self = <module 'transformers.models.transfo_xl' from '/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/transformers/models/transfo_xl/__init__.py'>, module_name = 'tokenization_transfo_xl'
def _get_module(self, module_name: str):
try:
> return importlib.import_module("." + module_name, self.__name__)
.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:905:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = '.tokenization_transfo_xl', package = 'transformers.models.transfo_xl'
def import_module(name, package=None):
"""Import a module.
The 'package' argument is required when performing a relative import. It
specifies the package to use as the anchor point from which to resolve the
relative import to an absolute import.
"""
level = 0
if name.startswith('.'):
if not package:
msg = ("the 'package' argument is required to perform a relative "
"import for {!r}")
raise TypeError(msg.format(name))
for character in name:
if character != '.':
break
level += 1
> return _bootstrap._gcd_import(name[level:], package, level)
../../../.pyenv/versions/3.10.3/lib/python3.10/importlib/__init__.py:126:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = 'transformers.models.transfo_xl.tokenization_transfo_xl', package = 'transformers.models.transfo_xl', level = 1
> ???
<frozen importlib._bootstrap>:1050:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = 'transformers.models.transfo_xl.tokenization_transfo_xl', import_ = <function _gcd_import at 0x10d54b400>
> ???
<frozen importlib._bootstrap>:1027:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = 'transformers.models.transfo_xl.tokenization_transfo_xl', import_ = <function _gcd_import at 0x10d54b400>
> ???
<frozen importlib._bootstrap>:1006:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
spec = ModuleSpec(name='transformers.models.transfo_xl.tokenization_transfo_xl', loader=<_frozen_importlib_external.SourceFil...bhalla/dev/second/server/.venv/lib/python3.10/site-packages/transformers/models/transfo_xl/tokenization_transfo_xl.py')
> ???
<frozen importlib._bootstrap>:688:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_frozen_importlib_external.SourceFileLoader object at 0x15c13db10>
module = <module 'transformers.models.transfo_xl.tokenization_transfo_xl' from '/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/transformers/models/transfo_xl/tokenization_transfo_xl.py'>
> ???
<frozen importlib._bootstrap_external>:883:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
f = <built-in function exec>
args = (<code object <module> at 0x15c185d10, file "/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/trans...ns.Counter'>, 'List': typing.List, 'Optional': typing.Optional, 'OrderedDict': <class 'collections.OrderedDict'>, ...})
kwds = {}
> ???
<frozen importlib._bootstrap>:241:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
# coding=utf-8
# Copyright 2018 Google AI, Google Brain and Carnegie Mellon University Authors and the HuggingFace Inc. team.
# Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Tokenization classes for Transformer XL model. Adapted from https://github.com/kimiyoung/transformer-xl.
"""
import glob
import os
import pickle
import re
from collections import Counter, OrderedDict
from typing import List, Optional, Tuple
import numpy as np
> import sacremoses as sm
E ModuleNotFoundError: No module named 'sacremoses'
.venv/lib/python3.10/site-packages/transformers/models/transfo_xl/tokenization_transfo_xl.py:30: ModuleNotFoundError
The above exception was the direct cause of the following exception:
client = <starlette.testclient.TestClient object at 0x15b2a19c0>
def test_time_weight(client):
"Tests whether it favours more recent memories"
memory_response_1 = client.post(
"/remember", json={"type": "DIRECT", "text": "the car is at the back"}
)
assert memory_response_1.status_code == 200
e2e_helper.consume_context_queue()
> with freeze_time(datetime.now() - timedelta(minutes=60), tick=True):
src/e2e/test_questions.py:168:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.venv/lib/python3.10/site-packages/freezegun/api.py:613: in __enter__
return self.start()
.venv/lib/python3.10/site-packages/freezegun/api.py:702: in start
module_attrs = _get_cached_module_attributes(module)
.venv/lib/python3.10/site-packages/freezegun/api.py:129: in _get_cached_module_attributes
_setup_module_cache(module)
.venv/lib/python3.10/site-packages/freezegun/api.py:108: in _setup_module_cache
all_module_attributes = _get_module_attributes(module)
.venv/lib/python3.10/site-packages/freezegun/api.py:97: in _get_module_attributes
attribute_value = getattr(module, attribute_name)
.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:896: in __getattr__
value = getattr(module, name)
.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:895: in __getattr__
module = self._get_module(self._class_to_module[name])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <module 'transformers.models.transfo_xl' from '/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/transformers/models/transfo_xl/__init__.py'>, module_name = 'tokenization_transfo_xl'
def _get_module(self, module_name: str):
try:
return importlib.import_module("." + module_name, self.__name__)
except Exception as e:
> raise RuntimeError(
f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
f" traceback):\n{e}"
) from e
E RuntimeError: Failed to import transformers.models.transfo_xl.tokenization_transfo_xl because of the following error (look up to see its traceback):
E No module named 'sacremoses'
.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:907: RuntimeError
The libraries that seem to be involved, transfo_xl and sacremoses, are not referenced anywhere in my code base. Another dependency might reference them, but these errors only appear when freeze_time is called, which makes me wonder why this is happening.
I could probably just install sacremoses, but given that it's not a dependency of my project, I'd rather solve the root problem here.
Does anyone see something obvious I am missing? Or can anyone recommend a library that achieves the same functionality as freeze_time?
Edit: it seems that installing pandas changes the error slightly:
self = <module 'transformers.models.tapas' from '/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/transformers/models/tapas/__init__.py'>
module_name = 'tokenization_tapas'
def _get_module(self, module_name: str):
try:
> return importlib.import_module("." + module_name, self.__name__)
.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:905:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = '.tokenization_tapas', package = 'transformers.models.tapas'
def import_module(name, package=None):
"""Import a module.
The 'package' argument is required when performing a relative import. It
specifies the package to use as the anchor point from which to resolve the
relative import to an absolute import.
"""
level = 0
if name.startswith('.'):
if not package:
msg = ("the 'package' argument is required to perform a relative "
"import for {!r}")
raise TypeError(msg.format(name))
for character in name:
if character != '.':
break
level += 1
> return _bootstrap._gcd_import(name[level:], package, level)
../../../.pyenv/versions/3.10.3/lib/python3.10/importlib/__init__.py:126:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = 'transformers.models.tapas.tokenization_tapas', package = 'transformers.models.tapas', level = 1
> ???
<frozen importlib._bootstrap>:1050:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = 'transformers.models.tapas.tokenization_tapas', import_ = <function _gcd_import at 0x10b4c3400>
> ???
<frozen importlib._bootstrap>:1027:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = 'transformers.models.tapas.tokenization_tapas', import_ = <function _gcd_import at 0x10b4c3400>
> ???
<frozen importlib._bootstrap>:1006:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
spec = ModuleSpec(name='transformers.models.tapas.tokenization_tapas', loader=<_frozen_importlib_external.SourceFileLoader ob...='/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/transformers/models/tapas/tokenization_tapas.py')
> ???
<frozen importlib._bootstrap>:688:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_frozen_importlib_external.SourceFileLoader object at 0x15a96f850>
module = <module 'transformers.models.tapas.tokenization_tapas' from '/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/transformers/models/tapas/tokenization_tapas.py'>
> ???
<frozen importlib._bootstrap_external>:883:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
f = <built-in function exec>
args = (<code object <module> at 0x15a9ac3a0, file "/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/trans... `'pt'`: Return PyTorch `torch.Tensor` objects.\n - `'np'`: Return Numpy `np.ndarray` objects.\n", ...})
kwds = {}
> ???
<frozen importlib._bootstrap>:241:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
# coding=utf-8
# Copyright 2020 Google Research and The HuggingFace Inc. team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
""" Tokenization class for TAPAS model."""
import collections
import datetime
import enum
import itertools
import math
import os
import re
import unicodedata
from dataclasses import dataclass
from typing import Callable, Dict, Generator, List, Optional, Text, Tuple, Union
import numpy as np
from ...tokenization_utils import PreTrainedTokenizer, _is_control, _is_punctuation, _is_whitespace
from ...tokenization_utils_base import (
ENCODE_KWARGS_DOCSTRING,
BatchEncoding,
EncodedInput,
PreTokenizedInput,
TextInput,
)
from ...utils import ExplicitEnum, PaddingStrategy, TensorType, add_end_docstrings, is_pandas_available, logging
if is_pandas_available():
> import pandas as pd
.venv/lib/python3.10/site-packages/transformers/models/tapas/tokenization_tapas.py:43:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
# flake8: noqa
__docformat__ = "restructuredtext"
# Let users know if they're missing any of our hard dependencies
hard_dependencies = ("numpy", "pytz", "dateutil")
missing_dependencies = []
for dependency in hard_dependencies:
try:
__import__(dependency)
except ImportError as e:
missing_dependencies.append(f"{dependency}: {e}")
if missing_dependencies:
raise ImportError(
"Unable to import required dependencies:\n" + "\n".join(missing_dependencies)
)
del hard_dependencies, dependency, missing_dependencies
# numpy compat
> from pandas.compat import is_numpy_dev as _is_numpy_dev
.venv/lib/python3.10/site-packages/pandas/__init__.py:22:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
"""
compat
======
Cross-compatible functions for different versions of Python.
Other items:
* platform checker
"""
import os
import platform
import sys
from pandas._typing import F
> from pandas.compat.numpy import (
is_numpy_dev,
np_version_under1p19,
np_version_under1p20,
)
.venv/lib/python3.10/site-packages/pandas/compat/__init__.py:15:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
""" support numpy compatibility across versions """
import numpy as np
> from pandas.util.version import Version
.venv/lib/python3.10/site-packages/pandas/compat/numpy/__init__.py:4:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> from pandas.util._decorators import ( # noqa:F401
Appender,
Substitution,
cache_readonly,
)
.venv/lib/python3.10/site-packages/pandas/util/__init__.py:1:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
from __future__ import annotations
from functools import wraps
import inspect
from textwrap import dedent
from typing import (
Any,
Callable,
Mapping,
cast,
)
import warnings
> from pandas._libs.properties import cache_readonly # noqa:F401
.venv/lib/python3.10/site-packages/pandas/util/_decorators.py:14:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
__all__ = [
"NaT",
"NaTType",
"OutOfBoundsDatetime",
"Period",
"Timedelta",
"Timestamp",
"iNaT",
"Interval",
]
> from pandas._libs.interval import Interval
.venv/lib/python3.10/site-packages/pandas/_libs/__init__.py:13:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
pandas/_libs/interval.pyx:1:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
pandas/_libs/hashtable.pyx:1:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
pandas/_libs/missing.pyx:1:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
__all__ = [
"dtypes",
"localize_pydatetime",
"NaT",
"NaTType",
"iNaT",
"nat_strings",
"OutOfBoundsDatetime",
"OutOfBoundsTimedelta",
"IncompatibleFrequency",
"Period",
"Resolution",
"Timedelta",
"normalize_i8_timestamps",
"is_date_array_normalized",
"dt64arr_to_periodarr",
"delta_to_nanoseconds",
"ints_to_pydatetime",
"ints_to_pytimedelta",
"get_resolution",
"Timestamp",
"tz_convert_from_utc_single",
"to_offset",
"Tick",
"BaseOffset",
"tz_compare",
]
from pandas._libs.tslibs import dtypes
> from pandas._libs.tslibs.conversion import (
OutOfBoundsTimedelta,
localize_pydatetime,
)
.venv/lib/python3.10/site-packages/pandas/_libs/tslibs/__init__.py:30:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
pandas/_libs/tslibs/conversion.pyx:1:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E TypeError: type 'pandas._libs.tslibs.base.ABCTimestamp' is not dynamically allocated but its base type 'FakeDatetime' is dynamically allocated
pandas/_libs/tslibs/base.pyx:1: TypeError
The above exception was the direct cause of the following exception:
client = <starlette.testclient.TestClient object at 0x159946f20>
def test_time_weight(client):
"Tests whether it favours more recent memories"
memory_response_1 = client.post(
"/remember", json={"type": "DIRECT", "text": "the car is at the back"}
)
assert memory_response_1.status_code == 200
e2e_helper.consume_context_queue()
> with freeze_time(datetime.now() - timedelta(minutes=60), tick=True):
src/e2e/test_questions.py:168:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.venv/lib/python3.10/site-packages/freezegun/api.py:633: in __enter__
return self.start()
.venv/lib/python3.10/site-packages/freezegun/api.py:722: in start
module_attrs = _get_cached_module_attributes(module)
.venv/lib/python3.10/site-packages/freezegun/api.py:129: in _get_cached_module_attributes
_setup_module_cache(module)
.venv/lib/python3.10/site-packages/freezegun/api.py:108: in _setup_module_cache
all_module_attributes = _get_module_attributes(module)
.venv/lib/python3.10/site-packages/freezegun/api.py:97: in _get_module_attributes
attribute_value = getattr(module, attribute_name)
.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:896: in __getattr__
value = getattr(module, name)
.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:895: in __getattr__
module = self._get_module(self._class_to_module[name])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <module 'transformers.models.tapas' from '/Users/rbhalla/dev/second/server/.venv/lib/python3.10/site-packages/transformers/models/tapas/__init__.py'>
module_name = 'tokenization_tapas'
def _get_module(self, module_name: str):
try:
return importlib.import_module("." + module_name, self.__name__)
except Exception as e:
> raise RuntimeError(
f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
f" traceback):\n{e}"
) from e
E RuntimeError: Failed to import transformers.models.tapas.tokenization_tapas because of the following error (look up to see its traceback):
E type 'pandas._libs.tslibs.base.ABCTimestamp' is not dynamically allocated but its base type 'FakeDatetime' is dynamically allocated
.venv/lib/python3.10/site-packages/transformers/utils/import_utils.py:907: RuntimeError
This is what I needed to add to fix this problem:
freezegun.configure(extend_ignore_list=['transformers'])
If anyone could explain why this was needed, I would appreciate it.
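A sketch of why this happens, based on the traceback above: at freeze_time() entry, freezegun scans every module in sys.modules, calling getattr() on each attribute to find datetime-like objects to patch (the _get_module_attributes frames). transformers exposes a lazy module whose __getattr__ imports submodules on first access, so the scan forces imports of optional models (transfo_xl needs sacremoses, tapas needs pandas) that your code never touches. The stand-in below is hypothetical code that simulates that interaction with only the standard library; FakeLazyModule is not the real transformers class:

```python
import sys
import types

class FakeLazyModule(types.ModuleType):
    """Hypothetical stand-in for transformers' lazy module object."""

    def __dir__(self):
        # Advertise a class that has not been imported yet, as _LazyModule does.
        return list(super().__dir__()) + ["TransfoXLTokenizer"]

    def __getattr__(self, name):
        # The real class runs importlib.import_module here; we simulate the
        # failure of importing an optional dependency that is not installed.
        raise ModuleNotFoundError("No module named 'sacremoses'")

lazy = FakeLazyModule("fake_transformers")
sys.modules["fake_transformers"] = lazy

def freezegun_style_scan(module):
    """Mimic freezegun's _get_module_attributes: getattr() every name in dir()."""
    for attribute_name in dir(module):
        getattr(module, attribute_name)  # lazy attributes import here and explode

try:
    freezegun_style_scan(lazy)
except ModuleNotFoundError as e:
    print("scan triggered the lazy import:", e)
```

That is why freezegun.configure(extend_ignore_list=['transformers']) fixes it: the package is excluded from the scan, so its lazy imports are never triggered. Placing that call in a pytest conftest.py ensures it runs before any test uses freeze_time.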

Returning List[Double], Map[String, Double] from a list of Doubles

I'm trying to return a tuple of a list and a map. I've already gotten the list to compile correctly, but I'm not sure how to use a map and list to get a map of the keys and values that match what's in my list.
I've been able to return just the list; however, I need to return (List[Double], Map[String, Double]). Here is what I have so far:
def emulateSingleInstruction(stack: List[Double],
env: Map[String, Double],
ins: StackMachineInstruction): (List[Double], Map[String, Double]) = {
ins match{
case AddI => stack match{
case i1 :: i2 :: tail => (i1 + i2 :: tail, env)
case _ => throw new IllegalArgumentException()
}
//USE V1 and V2
case SubI => stack match{
case i1 :: i2 :: tail => (i1 - i2 :: tail, env)
case _ => throw new IllegalArgumentException()
}
case MultI => stack match{
case i1 :: i2 :: tail => (i1 * i2 :: tail, env)
case _ => throw new IllegalArgumentException()
}
//USE V1 and V2
case DivI => stack match{
case i1 :: i2 :: tail => (i1 / i2 :: tail, env)
case _ => throw new IllegalArgumentException()
}
case ExpI => stack match{
case Nil => throw new IllegalArgumentException()
case head :: tail => {
(scala.math.exp(head) :: tail,env)
}
}
case LogI => stack match{
case Nil => throw new IllegalArgumentException()
case head :: tail => {
if (head > 0){(scala.math.log(head) :: tail,env)}
else{ throw new IllegalArgumentException()}
}
}
case SinI => stack match{
case Nil => throw new IllegalArgumentException()
case head :: tail => {
(scala.math.sin(head) :: tail,env)
}
}
case CosI => stack match{
case Nil => throw new IllegalArgumentException()
case head :: tail => {
(scala.math.cos(head) :: tail,env)
}
}
case PushI(f) => (f :: stack,env)
case PopI => stack match{
case Nil => throw new IllegalArgumentException()
case i1 :: tail => {
(tail,env)
}
}
}
}
Since your example operations seem not to modify the environment but only the stack, I understand you are simply asking how to combine the new stack and the (unchanged) environment in the return value.
def emulateSingleInstruction(stack: List[Double],
env: Map[String, Double],
ins: StackMachineInstruction): (List[Double], Map[String, Double]) = {
val newStack = ins match {
case AddI => // your code. Looks good...
}
val newEnv = // expression that evaluates to the updated environment
(newStack, newEnv)
}

Parsing parentheses on search expression - Scala parser combinator

I am writing a parser for a search expression.
Ex.
a = "zyx" and ( b < 5 or c > 9)
I wrote this parser, but it is not able to match the parentheses. I get this error:
failure: identifier expected
a = "zyx" and ( b < 5 or c > 9)
^
What can I do to be able to match the parentheses?
class SearchQueryParser extends StandardTokenParsers {
def expr: Parser[Expression] = orExp | "(" ~> orExp ~ ")"
def orExp: Parser[Expression] = {
andExp *("or" ^^^ {(a: Expression, b: Expression) => BoolExp("OR", (a, b))})
}
def andExp: Parser[Expression] = {
compareExp *("and" ^^^ {(a: Expression, b: Expression) => BoolExp("AND", (a, b))})
}
def compareExp: Parser[Expression] = {
identifier ~ rep(
("=" | "!=" | "<" | ">") ~ literal ^^ {
case op ~ rhs => (op, rhs)
}
) ^^ {
case (acc, ("=", rhs: Expression)) => Binomial("=", acc, rhs)
case (acc, ("!=", rhs: Expression)) => Binomial("!=", acc, rhs)
case (acc, ("<", rhs: Expression)) => Binomial("<", acc, rhs)
case (acc, (">", rhs: Expression)) => Binomial(">", acc, rhs)
}
}
}
Your current grammar only allows parentheses in the expr rule, which I assume is your main rule, and the expr rule is never used by any other rule. So parentheses are only allowed around the entire expression.
What you want to do is put "(" ~ expr ~ ")" in the lowest place where parenthesized expressions would be allowed. As I understand your grammar, that would probably mean allowing it as an alternative to compareExp in andExp (assuming you don't want to allow parentheses inside a compareExp).
As sepp2k mentioned, you need to put your parentheses rule at the lowest place in the grammar where parentheses are allowed. In your case it should be in compareExp:
def compareExp: Parser[Expression] = {
"(" ~> expr <~ ")" | identifier ~ rep(
("=" | "!=" | "<" | ">") ~ literal ^^ {
case op ~ rhs => (op, rhs)
}
) ^^ {
case lhs ~ elems =>
elems.foldLeft(lhs) {
case (acc, ("=", rhs)) => Binomial("=", acc, rhs)
case (acc, ("!=", rhs)) => Binomial("!=", acc, rhs)
case (acc, ("<", rhs)) => Binomial("<", acc, rhs)
case (acc, (">", rhs)) => Binomial(">", acc, rhs)
}
}
}
and the expr method should no longer handle the parentheses:
def expr: Parser[Expression] = orExp
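The grammar idea behind this fix can be shown without the combinator library: the parenthesis alternative lives at the bottom precedence level and recurses back to the top-level rule. Below is a hand-rolled sketch over a pre-split token list; the names (expr/term/atom) and the simplified grammar are illustrative, not the question's code:

```scala
// Grammar: expr := term {"or" term} ; term := atom {"and" atom}
//          atom := identifier | "(" expr ")"
sealed trait Exp
case class Id(name: String) extends Exp
case class Bool(op: String, l: Exp, r: Exp) extends Exp

def parse(tokens: List[String]): Exp = {
  def expr(ts: List[String]): (Exp, List[String]) = {
    var (l, rest) = term(ts)
    while (rest.headOption.contains("or")) {
      val (r, rest2) = term(rest.tail); l = Bool("OR", l, r); rest = rest2
    }
    (l, rest)
  }
  def term(ts: List[String]): (Exp, List[String]) = {
    var (l, rest) = atom(ts)
    while (rest.headOption.contains("and")) {
      val (r, rest2) = atom(rest.tail); l = Bool("AND", l, r); rest = rest2
    }
    (l, rest)
  }
  def atom(ts: List[String]): (Exp, List[String]) = ts match {
    case "(" :: rest =>
      val (e, rest2) = expr(rest)             // parentheses recurse back to expr
      require(rest2.headOption.contains(")"), "expected )")
      (e, rest2.tail)
    case id :: rest => (Id(id), rest)
    case Nil        => throw new IllegalArgumentException("unexpected end of input")
  }
  expr(tokens)._1
}

// "a and ( b or c )" parses with the parenthesized group as one AND operand:
// parse(List("a", "and", "(", "b", "or", "c", ")"))
//   == Bool("AND", Id("a"), Bool("OR", Id("b"), Id("c")))
```

Because atom calls expr, a parenthesized group can appear anywhere an identifier can, which is exactly what moving "(" ~> expr <~ ")" into compareExp achieves in the combinator version.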

MatchError in Scala parser combinator to parse query string

I am creating a parser to parse a query string like this
e = "500.3" AND dt = "20190710" AND s in ("ERROR", "WARN") OR cat = "Conditional"
For the previous string I get the following:
Exception in thread "main" scala.MatchError: (Identifier(e),(=,StringLiteral(500.3))) (of class scala.Tuple2)
I assume my grammar is fine (but maybe not). Can someone help me figure out why I am getting this error? Here is my parser:
class SearchQueryParser extends StandardTokenParsers {
lexical.reserved += ("OR", "AND")
lexical.delimiters += ( "<", "=", "<>", "!=", "<=", ">=", ">", "(", ")")
def expr: Parser[QueryExp] = orExp
def orExp: Parser[QueryExp] = andExp *("OR" ^^^ {(a: QueryExp, b: QueryExp) => BoolExp("OR", (a, b))})
def andExp: Parser[QueryExp] = compareExp *("AND" ^^^ {(a: QueryExp, b: QueryExp) => BoolExp("AND", (a, b))})
def compareExp: Parser[QueryExp] = {
identifier ~ rep(("=" | "<>" | "!=" | "<" | "<=" | ">" | ">=") ~ literal ^^ {
case op ~ rhs => (op, rhs)
}) ^^ {
case lhs ~ elems =>
elems.foldLeft(lhs) {
case (id, ("=", rhs: String)) => Binomial("=", id.str, rhs)
case (id, ("<>", rhs: String)) => Binomial("!=", id.str, rhs)
case (id, ("!=", rhs: String)) => Binomial("!=", id.str, rhs)
case (id, ("<", rhs: String)) => Binomial("<", id.str, rhs)
case (id, ("<=", rhs: String)) => Binomial("<=", id.str, rhs)
case (id, (">", rhs: String)) => Binomial(">", id.str, rhs)
case (id, (">=", rhs: String)) => Binomial(">=", id.str, rhs)
}
}
}
def literal: Parser[QueryExp] = stringLit ^^ (s => StringLiteral(s))
def identifier: Parser[QueryExp] = ident ^^ (s => Identifier(s))
def parse(queryStr: String): Option[QueryExp] = {
phrase(expr)(new lexical.Scanner(queryStr)) match {
case Success(r, _) => Option(r)
case x => println(x); None
}
}
}
I was able to find the issue. The error was produced because the partial function in the foldLeft(lhs) statement wasn't matching the tuple (=,StringLiteral(500.3)).
As you can see, in every case statement of the partial function I am trying to match a rhs of type String:
...
case (id, ("=", rhs: String)) => Binomial("=", id.str, rhs)
case (id, ("<>", rhs: String)) => Binomial("!=", id.str, rhs)
case (id, ("!=", rhs: String)) => Binomial("!=", id.str, rhs)
...
However, as the error Exception in thread "main" scala.MatchError: (Identifier(e),(=,StringLiteral(500.3))) (of class scala.Tuple2) shows, the input was parsed as a tuple of "=" and a StringLiteral, not a String.
The solution was to change the type of the rhs parameter:
...
case (id, ("=", rhs: StringLiteral)) => Binomial("=", id.str, rhs.toString)
case (id, ("<>", rhs: StringLiteral)) => Binomial("!=", id.str, rhs.toString)
case (id, ("!=", rhs: StringLiteral)) => Binomial("!=", id.str, rhs.toString)
...

Non-exhaustive Pattern Match Warning with Tuple

Given the following sealed trait:
scala> sealed trait Parent
defined trait Parent
scala> case object Boy extends Parent
defined object Boy
scala> case object Girl extends Parent
defined object Girl
And given xs:
scala> val xs: (Parent, (Seq[Int], Seq[Int])) = (Boy, (Nil, Nil))
xs: (Parent, (Seq[Int], Seq[Int])) = (Boy,(List(),List()))
scala> xs match {
| case (Boy, (Nil, Nil)) => 1
| case (Boy, (ys, a :: as)) => 2
| case (Boy, (ys, Nil)) => 3
| case (Girl, (Nil, Nil)) => 4
| case (Girl, (a :: as, ys)) => 5
| case (Girl, (Nil, ys)) => 6
| }
<console>:15: warning: match may not be exhaustive.
It would fail on the following inputs: (Boy, _), (Girl, _)
xs match {
^
res1: Int = 1
I don't understand this inexhaustive match warning. What do (Boy, _) and (Girl, _) mean?
I'm not sure how the (Seq[Int], Seq[Int]) could match anything other than the cases I have written.
I'm not sure why, but it seems Scala has trouble with the unapply (or is it unapplySeq) for Seq, which is the magic that makes the pattern matching possible. Apparently the cons in your matches is an issue with Seq. If you change your xs to use List, it would work:
val xs: (Parent, (List[Int], List[Int])) = (Boy, (Nil, Nil))
EDIT:
It may be that you can use Seq but just have to pattern match on the construction slightly differently, like this:
xs match {
case (Boy, (Nil, Nil)) => 1
case (Boy, (_, Seq(x, _*))) => 2
case (Boy, (_, Nil)) => 3
case (Girl, (Nil, Nil)) => 4
case (Girl, (Seq(x, _*), _)) => 5
case (Girl, (Nil, _)) => 6
}
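A sketch of both workarounds side by side, under the stated reasoning: the checker can prove exhaustiveness for List because Nil and :: form a sealed hierarchy, whereas Seq is open-ended, so cons patterns on Seq leave cases the checker cannot rule out. Guards sidestep the issue by matching any (a, b) pair:

```scala
sealed trait Parent
case object Boy extends Parent
case object Girl extends Parent

// Option 1: use List, whose Nil / :: cases the compiler can check.
def countList(xs: (Parent, (List[Int], List[Int]))): Int = xs match {
  case (Boy,  (Nil,    Nil))    => 1
  case (Boy,  (_,      _ :: _)) => 2
  case (Boy,  (_,      Nil))    => 3
  case (Girl, (Nil,    Nil))    => 4
  case (Girl, (_ :: _, _))      => 5
  case (Girl, (Nil,    _))      => 6
}

// Option 2: keep Seq and test emptiness with guards instead of cons patterns.
def countSeq(xs: (Parent, (Seq[Int], Seq[Int]))): Int = xs match {
  case (Boy,  (a, b)) if a.isEmpty && b.isEmpty => 1
  case (Boy,  (_, b)) if b.nonEmpty             => 2
  case (Boy,  _)                                => 3
  case (Girl, (a, b)) if a.isEmpty && b.isEmpty => 4
  case (Girl, (a, _)) if a.nonEmpty             => 5
  case (Girl, _)                                => 6
}

// countList((Boy, (Nil, Nil))) == 1 and countSeq((Girl, (Seq(1), Nil))) == 5
```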