Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Couldn't run sql: 'Connection' object has no attribute 'cursor' #59293

Closed
3 tasks done
Jonathan-Wei opened this issue Jul 21, 2024 · 5 comments
Closed
3 tasks done
Labels
Bug IO SQL to_sql, read_sql, read_sql_query Needs Info Clarification about behavior needed to assess issue

Comments

@Jonathan-Wei
Copy link

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

from sqlalchemy import create_engine
from sqlalchemy.schema import Table, MetaData
from sqlalchemy.sql.expression import select, text
engine = create_engine('starrocks://root:[email protected]:9030/default_catalog.ssb?charset=utf8')
connection = engine.connect()

print('\n'.join(['%s:%s' % item for item in connection.__dict__.items()]))

# You define a function that takes in a SQL query as a string and returns a pandas dataframe
def run_sql(sql: str) -> pd.DataFrame:
    print("query sql is :"+sql)
    df = pd.read_sql_query(sql, connection)
    return df

Issue Description

image

Expected Behavior

Running SQL normally

Installed Versions

Replace this line with the output of pd.show_versions()

Name: pandas
Version: 2.2.2

Name: SQLAlchemy
Version: 2.0.31

@Jonathan-Wei Jonathan-Wei added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jul 21, 2024
@rhshadrach
Copy link
Member

Thanks for the report. Can you include a full stack trace for the error being raised. Also, what is the output of type(connection) after connection = engine.connect()?

@rhshadrach rhshadrach added IO SQL to_sql, read_sql, read_sql_query Needs Info Clarification about behavior needed to assess issue and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Jul 21, 2024
@disarticulate
Copy link

disarticulate commented Jul 28, 2024

Pandas v2.2.2
sqlalchemy 1.4.52

this doesnt work:

from sqlalchemy import create_engine

uri = build_db_conn()
connection = create_engine(uri).connect()
print (uri, type(connection))
pandas.read_sql(sql, con = connection)
postgresql://****:****@localhost:34220/**** <class 'sqlalchemy.engine.base.Connection'>
/tmp/ipykernel_14167/2991965835.py:4: UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy.
  pandas.read_sql(sql, con = connection)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[37], line 4
      2 connection = create_engine(uri).connect()
      3 print (uri, type(connection))
----> 4 pandas.read_sql(sql, con = connection)

File /opt/conda/lib/python3.11/site-packages/pandas/io/sql.py:706, in read_sql(sql, con, index_col, coerce_float, params, parse_dates, columns, chunksize, dtype_backend, dtype)
    704 with pandasSQL_builder(con) as pandas_sql:
    705     if isinstance(pandas_sql, SQLiteDatabase):
--> 706         return pandas_sql.read_query(
    707             sql,
    708             index_col=index_col,
    709             params=params,
    710             coerce_float=coerce_float,
    711             parse_dates=parse_dates,
    712             chunksize=chunksize,
    713             dtype_backend=dtype_backend,
    714             dtype=dtype,
    715         )
    717     try:
    718         _is_table_name = pandas_sql.has_table(sql)

File /opt/conda/lib/python3.11/site-packages/pandas/io/sql.py:2738, in SQLiteDatabase.read_query(self, sql, index_col, coerce_float, parse_dates, params, chunksize, dtype, dtype_backend)
   2727 def read_query(
   2728     self,
   2729     sql,
   (...)
   2736     dtype_backend: DtypeBackend | Literal["numpy"] = "numpy",
   2737 ) -> DataFrame | Iterator[DataFrame]:
-> 2738     cursor = self.execute(sql, params)
   2739     columns = [col_desc[0] for col_desc in cursor.description]
   2741     if chunksize is not None:

File /opt/conda/lib/python3.11/site-packages/pandas/io/sql.py:2672, in SQLiteDatabase.execute(self, sql, params)
   2670     raise TypeError("Query must be a string unless using sqlalchemy.")
   2671 args = [] if params is None else [params]
-> 2672 cur = self.con.cursor()
   2673 try:
   2674     cur.execute(sql, *args)

AttributeError: 'Connection' object has no attribute 'cursor'

This also doesn't work:

 pandas.read_sql(sql, build_db_conn())
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[41], line 1
----> 1 pandas.read_sql(sql, build_db_conn())

File /opt/conda/lib/python3.11/site-packages/pandas/io/sql.py:704, in read_sql(sql, con, index_col, coerce_float, params, parse_dates, columns, chunksize, dtype_backend, dtype)
    701     dtype_backend = "numpy"  # type: ignore[assignment]
    702 assert dtype_backend is not lib.no_default
--> 704 with pandasSQL_builder(con) as pandas_sql:
    705     if isinstance(pandas_sql, SQLiteDatabase):
    706         return pandas_sql.read_query(
    707             sql,
    708             index_col=index_col,
   (...)
    714             dtype=dtype,
    715         )

File /opt/conda/lib/python3.11/site-packages/pandas/io/sql.py:903, in pandasSQL_builder(con, schema, need_transaction)
    900 sqlalchemy = import_optional_dependency("sqlalchemy", errors="ignore")
    902 if isinstance(con, str) and sqlalchemy is None:
--> 903     raise ImportError("Using URI string without sqlalchemy installed.")
    905 if sqlalchemy is not None and isinstance(con, (str, sqlalchemy.engine.Connectable)):
    906     return SQLDatabase(con, schema, need_transaction)

ImportError: Using URI string without sqlalchemy installed.

this works:

pandas.read_sql(sql, con = create_engine(build_db_conn()).connect().connection)

but gives warning:

/tmp/ipykernel_14167/3615795558.py:1: UserWarning: pandas only supports SQLAlchemy connectable (engine/connection) or database string URI or sqlite3 DBAPI2 connection. Other DBAPI2 objects are not tested. Please consider using SQLAlchemy.

Looks like if isinstance(pandas_sql, SQLiteDatabase): is failing to discover postgres?

@asishm
Copy link
Contributor

asishm commented Jul 29, 2024

@disarticulate your issue is unrelated. pandas 2.2+ requires sqlalchemy 2.0+ (see #57049). The OP shows that the version requirements are satisfied.

@MotoMatt5040
Copy link

MotoMatt5040 commented Jul 30, 2024

I think I may be missing some info here, but for starters it may be better to restructure your code.

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('starrocks://root:[email protected]:9030/default_catalog.ssb?charset=utf8')

def run_sql(sql: str) -> pd.DataFrame:
    print("Query SQL is: " + sql)
    with engine.connect() as connection:
        df = pd.read_sql_query(sql, connection)
    return df

this will ensure your connection is opened and closed when you are requesting data.

I wonder if it is an issue with the URI itself or the way you are passing the sql string as it works for me when I use it with my own database.

@mroeschke
Copy link
Member

Looks like some additional information is needed to help diagnose. Closing but feel free to open a new issue if you can provide that info

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO SQL to_sql, read_sql, read_sql_query Needs Info Clarification about behavior needed to assess issue
Projects
None yet
Development

No branches or pull requests

6 participants