Databases the SQL [see-kwuhl]

Subscribe: Apple Podcasts | Spotify | TuneIn | RSS

Welcome back for part 2 of the podcast about databases. In this half, we discuss several of the things we believe that developers should know about databases. From joins to unions, group by’s and indexing, we try to touch on a lot of the items that most developers should at least be familiar with when working with database systems.

News

Databases the SQL

Attended Build Guild Atlanta, a social gathering of developers who were talking shop in a relaxed, casual atmosphere.
http://atlanta.buildguild.org/
Met at a place called the Joystick Gamebar – apparently has some insanely good french fries:
http://joystickgamebar.com/
dotPeek decompiler
http://www.jetbrains.com/decompiler/
Decompile AND Debug code that’s not even your own!
Make sure if you’re hosting in “The Cloud”, ensure that you’ve secured and backed up your assets as best as you can.
- Use multi-factor authentication if possible
- Use strong passwords
- Backup (maybe outside the cloud)
  http://www.codespaces.com/
OWASP Meetup – OWASP Mobile Vulnerabilities
http://www.meetup.com/OWASP-Atlanta/
Checkout the Security Now Podcast
https://www.grc.com/securitynow.htm
Another 5 star review in iTunes!!! Be the next one!
Huge thanks to Skinner_MW
http://www.codingblocks.net/iTunes
Some fantastic feedback from Jim Basilio on Java and Springboot in response to Episode 12
http://www.codingblocks.net/podcast/episode-12-what-do-you-want-to-be-when-you-grow-up-define-me/#comment-1492256782
Tron Anderson left an excellent comment in Episode 13 regarding various ways to do the recursive queries. Must read for people looking how to query parent-child tables effectively
http://www.codingblocks.net/podcast/all-your-database-are-belong-to-us/#comment-1492256938

Database Basics, and Maybe a TOUCH of Advanced Stuff

CROSS JOIN – cartesian product of two tables – every row in table 1 matched up with every row in table 2
- Careful! Doing this on large tables could crash your server!
INNER JOIN – where the only rows you get back is when the data in table 1 matches the data in table 2 on the join conditions
Outer Joins – LEFT OUTER, RIGHT OUTER, FULL OUTER
- LEFT OUTER will return all records from the table on the left side of the join and any data that matches in the right table, otherwise the data in the right table will be nulled
- RIGHT OUTER will return all records from the table on the right side of the join and any data that matches in the left table, otherwise the data in the left table will be nulled
- FULL OUTER will return all data from both tables with the data that’s common between the two tables fully filled in, otherwise, the data that’s missing from each side will be nulled
Database Normalization
http://en.wikipedia.org/wiki/Database_normalization
Checkout @SqlKris on Twitter – runs a database blog on learning SQL and very helpful in responding to questions on Twitter
https://twitter.com/sqlkris
Refactoring databases can be very difficult – usually means refactoring a lot of application code, not to mention any stored procedures, views, etc that may live in the database
Outlaw is still 21….
Do you put your data interactions in a stored procedure or do you put that code in an application?
- Pros would be that you’ve centralized your database “logic”
- Where this doesn’t work – if you need data from other systems and using linked servers is not an option
You can join tables across databases (at least in SQL Server)
Cardinality – one to one or one to many
To subtype or not to subtype a table?
- If you decide to do this, you could have hundreds of tables and managing this through your application could be a major pain…but, the performance would be outstanding
- If you don’t do subtypes but you do the EAV route (Entity Attribute Value schema), it’s easier to maintain but query performance wouldn’t be as good as the subtyping
  http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
Set Operators
- UNION – appends two recordsets together (and throws out duplicates)
- UNION ALL – appends two recordsets together and keeps the duplicates
- EXCEPT – returns all the rows in the first recordset unless it’s in the second recordset
- INTERSECT – returns all the rows that are common between the first recordset and the second recordset (similar to doing an INNER JOIN on every column being returned from the two tables being used)
Check out SQL Authority
http://www.sqlauthority.com
Aggregating Data
- Difference between a HAVING and a WHERE clause? – Interview question asked in every developer interview known to man! 🙂
- GROUP BY – used to “group” or aggregate data based off the provided columns
  - Have to use a GROUP BY when doing an AVG (average) or a SUM or a MAX, MIN, etc.
  - Why no GROUP BY *????
- DISTINCT or GROUP BY – can do similar things if you’re trying to remove duplicate values
- COUNT(DISTINCT…)
Row numbers – think paging – you want to get records between 100 and 120
- Oracle – rownum
- SQL Server 2005 and up – ROW_NUMBER()
- mySQL – start drinking heavily
Windowed Functions in SQL Server – GLORIOUS
- RANK
- DENSE RANK
- NTILE
- ROW_NUMBER
  http://msdn.microsoft.com/en-us/library/ms189798.aspx
Is char….”char” as in you burnt our burgers, or is it “car” as in you drive it – PLEASE, leave your comment below!!!
nvarchar vs varchar – if you will EVER need to store UNICODE (international characters, etc.), then go nvarchar…if not, save the space and use varchar
To Guid or not to Guid?! Why they suck as a primary key on your table (for performance)
Parameterized queries – USE THEM!
OWASP in Episode 4
https://www.owasp.org/index.php/Query_Parameterization_Cheat_Sheet
What about SQL Developers who want to program?
- PHP
- Perl (similar to what database guys do with scrubbing data)
- Javascript – simple language to learn out of the box – extremely powerful with things like NodeJS

Performance in Databases

Indexes
- Clustered Indexes – stores the data sorted in the table (makes your table a clustered table)
- Non-clustered indexes – stored outside the table but points back to the records in the main table storage
- Can index temp tables! Sometimes necessary
- SQL Server 2008 (and up) – Filtered Indexes
- Creating a ton of indexes is not always the right solution!
- Understanding fill factors – leaving space for wiggle room on an index
- CAN be a performance bottleneck on inserts / updates