Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Report: UPDATE with JOINs with all the same vindexes does not need OperatorType DMLWithInput #16876

Open
wiebeytec opened this issue Oct 2, 2024 · 0 comments

Comments

@wiebeytec
Copy link

Overview of the Issue

Vitess recently got UPDATE with JOIN support, but there may be some inefficiency/deficiency. It seems to resort to OperatorType DMLWithInput (which rewrites to SELECT ... FOR UPDATE; UPDATE ... ) when it doesn't have to. This is not only inefficient, but also, when does that rewrite, it causes deadlocks if you have multiple connections doing it an the same time (in transactions). I ran into this.

Expected result: when using fields from joined tables from the same shard/vindex in the SET expression, it can still use OperatorType Update.

Reproduction Steps

As an example that is treated well: I use hard-coded values in SET, it uses a normal update:

vexplain UPDATE
alarmLog AS al
INNER JOIN lastLogData as lld
ON al.idSite = lld.idSite
SET al.valueEnum = 2
WHERE lld.valueEnum != 1
AND al.idSite = 98478 \G
{                                                                                       
        "OperatorType": "Update",                                                             
        "Variant": "EqualUnique",                                                             
        "Keyspace": {                                                                         
                "Name": "sites2024",                                                                                  
                "Sharded": true                                                                                                                                                             
        },                                                                                                                                                                                  
        "TargetTabletType": "PRIMARY",                                                                                                                                                                                                      
        "Query": "update alarmLog as al, lastLogData as lld set al.valueEnum = 2 where al.idSite = 98478 and lld.valueEnum != 1 and al.idSite = lld.idSite",                                                                                
        "Table": "alarmLog",                                                                                          
        "Values": [                    
                "98478"                
        ],                                                                                    
        "Vindex": "a_standard_hash"                                                           
}

But when I take the value to set from the JOIN, it uses DMLWithInput:

vexplain UPDATE
alarmLog AS al
INNER JOIN lastLogData as lld
ON al.idSite = lld.idSite
SET al.valueEnum = lld.valueEnum
WHERE lld.valueEnum != 1
AND al.idSite = 98478 
\G
{
        "OperatorType": "DMLWithInput",
        "TargetTabletType": "PRIMARY",
        "BindVars": [
                "0:[lld_valueEnum:1]"
        ],
        "Offset": [
                "0:[0]"
        ],
        "Inputs": [
                {
                        "OperatorType": "Route",
                        "Variant": "EqualUnique",
                        "Keyspace": {
                                "Name": "sites2024",
                                "Sharded": true
                        },
                        "FieldQuery": "select al.idAlarm, lld.valueEnum from alarmLog as al, lastLogData as lld where 1 != 1",
                        "Query": "select al.idAlarm, lld.valueEnum from alarmLog as al, lastLogData as lld where al.idSite = 98478 and lld.valueEnum != 1 and al.idSite = lld.idSite for update",
                        "Table": "alarmLog, lastLogData",
                        "Values": [
                                "98478"
                        ],
                        "Vindex": "a_standard_hash"
                },
                {
                        "OperatorType": "Update",
                        "Variant": "Scatter",
                        "Keyspace": {
                                "Name": "sites2024",
                                "Sharded": true
                        },
                        "TargetTabletType": "PRIMARY",
                        "Query": "update alarmLog as al set al.valueEnum = :lld_valueEnum where al.idAlarm in ::dml_vals",
                        "Table": "alarmLog"
                }
        ]
}

The relevant table entries in the VSchema:

"alarmLog": {         
      "column_vindexes": [
        {                    
          "column": "idSite",      
          "name": "a_standard_hash"
        }
      ],
      "auto_increment" : {
        "column": "idAlarm",
        "sequence": "alarmLog_seq"                                                                                    
      }
    },

And:

"lastLogData": {              
      "column_vindexes": [
        {
          "column": "idSite",
          "name": "a_standard_hash"
        }
      ]                      
    },

Binary Version

vtgate version Version: 20.0.2 (Git revision 2592c5932b3036647868299b6df76f8ef28dfbc8 branch 'HEAD') built on Wed Sep 11 08:15:20 UTC 2024 by runner@fv-az1152-369 using go1.22.7 linux/amd64

Operating System and Environment details

cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=24.04
DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.1 LTS"

Log Fragments

No response

@wiebeytec wiebeytec added Needs Triage This issue needs to be correctly labelled and triaged Type: Bug labels Oct 2, 2024
@systay systay added Component: Query Serving and removed Needs Triage This issue needs to be correctly labelled and triaged labels Oct 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants