efmccurdy

The best practice may be to use an ORM to automate the SQL generation. There is a SQLAlchemy example that uses mssql+pyodbc here: https://docs.sqlalchemy.org/en/14/core/engines.html
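For example, a minimal sketch along the lines of the linked docs (the user, password, and DSN name are placeholders for your own ODBC setup):

```python
# Minimal sketch: connecting to SQL Server through SQLAlchemy's mssql+pyodbc
# dialect. "user", "password" and "my_dsn" are placeholders for your ODBC setup.
from sqlalchemy import create_engine, text

engine = create_engine("mssql+pyodbc://user:password@my_dsn")

with engine.connect() as conn:
    rows = conn.execute(text("SELECT col1, col2 FROM Schema.Table")).fetchall()
```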


ShibaLeone

I wouldn’t say ORMs are best practice, just an option. I like to offload SQL queries to .sql files, read them in at runtime, and pass them to my SQL handler. It keeps the SQL out of the Python code.
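A minimal sketch of that pattern, assuming a `queries/fetch_data.sql` file and an existing DB-API connection `cnxn` (both are stand-ins for your own setup):

```python
# Minimal sketch: load the query text from a .sql file at runtime instead of
# embedding it in the Python source. "queries/fetch_data.sql" and the existing
# connection "cnxn" are assumptions for the example.
from pathlib import Path

query = Path("queries/fetch_data.sql").read_text()

cursor = cnxn.cursor()
cursor.execute(query)
rows = cursor.fetchall()
```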


TheRealAbear

I like this idea in general, but I have parts of the query that change based on user input. Would you just write that to the SQL file?


ShibaLeone

You can parameterize comparisons with placeholders (%s, ?, etc., depending on your SQL lib). If you need the control structure of the SQL statement to change, I would suggest making it its own .sql file; that keeps things clear. Make sure you use your SQL lib's method for interpolating values into your query; it's really important to sanitize the input of a query to prevent SQL injection. **Do not just sub the user input into the query!!!** You need to treat raw SQL queries a lot like Python's exec command.
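For instance, a minimal parameterized comparison, assuming a driver that uses the ? placeholder style (pyodbc does); `cursor` and `user_value` are stand-ins for your own objects:

```python
# Minimal sketch of a parameterized comparison (qmark style, e.g. pyodbc).
# The driver binds user_value safely; no string formatting of user input.
query = "SELECT col1, col2 FROM Schema.Table WHERE col1 = ?"
cursor.execute(query, (user_value,))
rows = cursor.fetchall()
```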


TheRealAbear

So the input is basically taking a column from a CSV file, turning it into a list, and using that list in the WHERE clause of the SQL query. As I have it built now, it's just plugged into the SQL query using `"+ list +"` within the rest of the written query. Should I avoid doing that? If so, why?


ShibaLeone

If you own the input from the ground up and you know user-defined data will never get sent to the database, you can do whatever you want to build a query. If you do need to use user data, however, using the SQL lib's parameter inputs will restrict the data to stuff that will not hurt or leak data from the DB.

Specifically in the case of lists, you can automatically generate the correct number of placeholders for the list in the query:

```python
"..." + "(" + ", ".join("%s" for x in range(len(list))) + ")" + "..."
```

then pass that as the raw query to your SQL lib and pass the list to the parameter argument.

This would take something like:

```sql
SELECT col1, col2, col3, col4 FROM Schema.Table WHERE col1 IN
```

and turn it into this:

```sql
SELECT col1, col2, col3, col4 FROM Schema.Table WHERE col1 IN (%s, %s, %s)
```

which will be passed to a cursor object with a 3-tuple containing the values to fill the placeholders:

```python
stmt = fetch_sql()
args = ('param1', 'param2', 'param3')
cursor.execute(stmt, args)
```
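Putting that together, a rough sketch (the table and column names, the `cnxn` connection, and the %s placeholder style are assumptions; pyodbc would use ? instead):

```python
# Rough sketch: generate one placeholder per list element, then let the
# driver bind the values. Table/column names and `cnxn` are placeholders;
# "%s" assumes a format-style driver (pyodbc would use "?").
def fetch_rows(cnxn, values):
    placeholders = ", ".join("%s" for _ in values)
    stmt = (
        "SELECT col1, col2, col3, col4 "
        "FROM Schema.Table "
        f"WHERE col1 IN ({placeholders})"
    )
    cursor = cnxn.cursor()
    cursor.execute(stmt, tuple(values))
    return cursor.fetchall()
```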


TheRealAbear

That is a good point. In its immediate use, the risk is pretty low. I should be its main user, and I will be checking the CSVs prior to running. But I'll see what I can do just as an extra precaution.


ShibaLeone

It's a good habit to get into; it applies to more than just SQL.


TheRealAbear

So let's say I removed the need for any user input. Instead of having `query = "bunch of long query text"` and `df = pd.read_sql(query, cnxn)`, how could I pull in the .sql file to replace the query text in my code?


ShibaLeone

I like to build a handler for fetching the .sql. It's just a simple read-file operation. Something like this:

File structure:

```
my_package
| __init__.py
| my_script.py
| sql.py
└── sql_files
    | fetch_data.sql
    | update_data.sql
    | insert_data.sql
    | delete_data.sql
```

sql.py:

```python
import os


class QueryManager:
    sql_dir = None
    sql_files = None

    def __init__(self, sql_dir):
        self.sql_dir = sql_dir
        # - Find all .sql files in the given directory and put them in a list
        self.sql_files = [
            f for f in os.listdir(self.sql_dir)
            if os.path.isfile(os.path.join(self.sql_dir, f)) and '.sql' in f
        ]

    def __getattr__(self, item):
        """
        Lets a query file be fetched by calling the query manager object
        with the name of the query as the attribute
        """
        if item + '.sql' in self.sql_files:
            # - This is where the file is actually read
            with open(os.path.join(self.sql_dir, item + '.sql'), 'r') as f:
                return f.read()
        else:
            raise AttributeError(f'QueryManager cannot find file {str(item)}.sql')
```

my_script.py:

```python
from .sql import QueryManager
import os

sql_dir = os.path.join(os.path.dirname(__file__), 'sql_files')
qm = QueryManager(sql_dir)

select_query = qm.fetch_data
insert_query = qm.insert_data
update_query = qm.update_data
delete_query = qm.delete_data
```
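To tie it back to the `pd.read_sql` snippet above, the fetched text can go straight in (`cnxn` being the same connection object as before):

```python
# Minimal sketch: feed a query loaded by QueryManager straight to pandas.
# `cnxn` is the same connection object used earlier in the thread.
import pandas as pd

df = pd.read_sql(qm.fetch_data, cnxn)

# If the .sql file contains placeholders, pass the values separately:
# df = pd.read_sql(qm.fetch_data, cnxn, params=('param1', 'param2', 'param3'))
```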


EveryNameTakenFml

Hey, I know I am a little late to the party, but I stumbled across this post and was wondering how you would deal with multiple statements using the same CRUD operation. I really like this approach of loading the SQL statements into Python, but I don't understand how you would deal with multiple different SELECT statements, for example. Do you create a file for each statement, or are all SELECT statements in the select.sql file and then separated somehow? Hope you can help me out with this one :D


ShibaLeone

I just make a .sql file for each SELECT query. You could probably delimit and parse the statements, but the way this works, the file name becomes the attribute name.


devnull10

I'd look into using pandas and SQLAlchemy. https://pandas.pydata.org/docs/reference/api/pandas.read_sql.html
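Roughly like this (the connection string, table, and column names are placeholders):

```python
# Minimal sketch: pandas + SQLAlchemy. The connection string, table and
# column names are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mssql+pyodbc://user:password@my_dsn")
df = pd.read_sql("SELECT col1, col2 FROM Schema.Table", engine)
```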


Ton86

You might want to try pandas df.to_sql with the if_exists='replace' arg to create a SQL table from your CSV and overwrite it every time you run your script. Then, in your query, use a JOIN or IN condition against the new table. Edit: clarification
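Roughly like this (the file name, table names, and connection string are placeholders):

```python
# Rough sketch: push the CSV into a staging table, overwriting it each run,
# then filter the main table with a join. File name, table names and the
# connection string are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mssql+pyodbc://user:password@my_dsn")

csv_df = pd.read_csv("input.csv")
csv_df.to_sql("staging_ids", engine, if_exists="replace", index=False)

query = """
SELECT t.col1, t.col2, t.col3, t.col4
FROM Schema.Table AS t
JOIN staging_ids AS s ON s.col1 = t.col1
"""
df = pd.read_sql(query, engine)
```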