OpenCodePapers

Visual Question Answering (VQA) on the GQA dataset (test2019 split)
Leaderboard
All metrics are percentages; higher is better for every column except Distribution, which measures how far the predicted answer distribution deviates from the ground-truth distribution (lower is better; the human baseline scores 0.0). A ✓ in the Code column indicates a linked code release.

| Paper | Code | Accuracy | Binary | Open | Consistency | Plausibility | Validity | Distribution | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|---|---|
| | | 89.3 | 91.2 | 87.4 | 98.4 | 97.2 | 98.9 | 0.0 | human | |
| | | 76.04 | 84.46 | 68.6 | 91.47 | 83.75 | 96.42 | 3.68 | DREAM+Unicoder-VL (MSRA) | |
| | | 74.03 | 82.12 | 66.89 | 89.0 | 83.58 | 96.76 | 1.29 | TRRNet (Ensemble) | |
| | | 73.81 | 80.8 | 67.64 | 91.76 | 83.9 | 96.73 | 1.7 | MIL-nbgao | |
| | | 73.33 | 79.68 | 67.73 | 77.02 | 83.7 | 96.36 | 2.46 | Kakao Brain | |
| | | 72.14 | 81.16 | 64.19 | 90.96 | 84.81 | 96.77 | 2.39 | Coarse-to-Fine Reasoning, Single Model | |
| | | 70.23 | 77.5 | 63.82 | 86.94 | 83.77 | 96.65 | 1.49 | 270 | |
| | | 67.55 | 80.45 | 56.16 | 93.83 | 84.16 | 96.53 | 2.78 | NSM ensemble (updated) | |
| | | 64.92 | 82.63 | 49.29 | 94.37 | 84.91 | 96.64 | 5.11 | VinVL-DPT | |
| VinVL+L: Enriching Visual Representation with Location Context in VQA | ✓ | 64.85 | 82.59 | 49.19 | 94.0 | 84.91 | 96.62 | 4.59 | VinVL+L | 2023-02-22 |
| VinVL: Revisiting Visual Representations in Vision-Language Models | ✓ | 64.65 | 82.63 | 48.77 | 94.35 | 84.98 | 96.62 | 4.72 | Single Model | 2021-01-02 |
| | | 63.94 | 80.84 | 49.03 | 91.54 | 84.74 | 96.56 | 4.69 | Wayne | |
| | | 63.2 | 77.91 | 50.22 | 89.84 | 85.15 | 96.47 | 5.25 | Single | |
| | | 63.17 | 78.94 | 49.25 | 93.25 | 84.28 | 96.41 | 3.71 | NSM single (updated) | |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | ✓ | 62.71 | 79.79 | 47.64 | 93.1 | 85.21 | 96.36 | 6.42 | LXR955, Ensemble | 2019-08-20 |
| | | 62.45 | 80.91 | 46.15 | 93.95 | 84.15 | 96.33 | 5.36 | MDETR | |
| | | 62.44 | 80.28 | 46.69 | 94.36 | 84.91 | 96.46 | 5.33 | 1-gqa | |
| | | 61.49 | 78.4 | 46.56 | 88.68 | 84.85 | 96.33 | 5.7 | UCM | |
| Bilinear Graph Networks for Visual Question Answering | | 61.22 | 78.69 | 45.81 | 90.31 | 85.43 | 96.36 | 6.77 | GRN | 2019-07-23 |
| | | 61.12 | 78.07 | 46.16 | 91.13 | 84.8 | 96.36 | 5.55 | lxmert-adv-txt | |
| | | 61.1 | 77.99 | 46.19 | 91.08 | 84.82 | 96.36 | 5.52 | lxmert-adv-txt | |
| | | 61.09 | 77.84 | 46.3 | 88.92 | 85.49 | 96.43 | 5.68 | MSM@MSRA | |
| | | 61.05 | 78.02 | 46.06 | 89.77 | 84.95 | 96.5 | 5.24 | mlmbert | |
| | | 60.98 | 77.32 | 46.55 | 90.77 | 84.93 | 96.38 | 5.36 | fisher | |
| | | 60.95 | 78.41 | 45.54 | 89.08 | 84.27 | 96.35 | 4.86 | ckpt 19 exp 90 | |
| | | 60.93 | 77.83 | 46.01 | 90.3 | 84.69 | 96.35 | 5.74 | 45 | |
| | | 60.89 | 78.07 | 45.73 | 93.02 | 84.05 | 96.0 | 5.31 | IQA (single) | |
| | | 60.87 | 79.12 | 44.76 | 92.61 | 85.63 | 96.35 | 8.56 | Ensemble10 | |
| | | 60.83 | 78.9 | 44.89 | 92.49 | 84.55 | 96.19 | 5.54 | Meta Module, Single | |
| | | 60.7 | 77.41 | 45.96 | 89.65 | 84.55 | 96.37 | 6.09 | xpj | |
| | | 60.67 | 78.02 | 45.36 | 89.81 | 84.84 | 96.31 | 6.41 | fbe20v3.json | |
| | | 60.59 | 78.44 | 44.83 | 92.66 | 85.38 | 96.57 | 7.28 | LININ | |
| | | 60.51 | 76.87 | 46.06 | 88.2 | 85.19 | 96.15 | 8.48 | prompt IMT-16 | |
| | | 60.42 | 77.12 | 45.68 | 89.69 | 84.56 | 96.35 | 6.03 | vv69 | |
| | | 60.37 | 77.09 | 45.61 | 89.77 | 84.56 | 96.22 | 6.43 | bert_v1 | |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | ✓ | 60.33 | 77.16 | 45.47 | 89.59 | 84.53 | 96.35 | 5.69 | LXR955, Single Model | 2019-08-20 |
| | | 60.28 | 77.13 | 45.41 | 89.47 | 84.45 | 96.33 | 5.38 | IIE_Morningstar | |
| | | 60.27 | 76.99 | 45.51 | 90.16 | 84.49 | 96.31 | 5.39 | full_nsp_ft_results_submit_predict.json | |
| | | 60.18 | 76.97 | 45.36 | 89.65 | 84.47 | 96.33 | 5.29 | TESTOVQA007 | |
| | | 60.18 | 76.84 | 45.48 | 89.77 | 84.6 | 96.37 | 5.65 | test gqa | |
| | | 60.17 | 77.19 | 45.14 | 89.61 | 84.46 | 96.36 | 5.83 | Future_Test_team | |
| | | 60.14 | 77.15 | 45.12 | 89.58 | 84.47 | 96.36 | 5.81 | tmp | |
| | | 60.07 | 76.84 | 45.27 | 89.32 | 84.55 | 96.35 | 6.21 | Inspur | |
| | | 60.02 | 76.37 | 45.59 | 90.05 | 84.34 | 96.29 | 5.63 | full_nsp_mlm_ft_joint_results_submit_predict.json | |
| | | 60.01 | 76.77 | 45.21 | 89.17 | 84.46 | 96.35 | 6.28 | SSRP | |
| | | 59.93 | 79.09 | 43.02 | 93.72 | 85.92 | 96.41 | 10.1 | Musan | |
| | | 59.84 | 76.79 | 44.89 | 89.52 | 84.72 | 96.2 | 6.06 | gaochongyang9 | |
| | | 59.81 | 78.02 | 43.75 | 91.43 | 84.77 | 96.5 | 6.0 | PVR | |
| | | 59.8 | 76.74 | 44.85 | 89.14 | 84.2 | 96.23 | 5.11 | BgTest | |
| | | 59.72 | 77.97 | 43.61 | 89.43 | 84.89 | 96.55 | 6.25 | DAM | |
| | | 59.54 | 77.98 | 43.26 | 89.21 | 84.94 | 96.24 | 6.01 | DL16 | |
| | | 59.43 | 77.11 | 43.82 | 89.05 | 84.94 | 96.56 | 6.39 | mcmi | |
| | | 59.37 | 77.53 | 43.35 | 88.63 | 84.71 | 96.18 | 6.06 | rishabh_test | |
| | | 59.29 | 77.31 | 43.38 | 88.94 | 84.43 | 96.3 | 5.8 | UNITER + MAC + Graph Networks | |
| | | 59.12 | 76.69 | 43.6 | 88.9 | 84.78 | 96.43 | 5.6 | LXMERT-S | |
| | | 59.06 | 76.07 | 44.04 | 89.81 | 82.76 | 93.82 | 6.14 | QGCRGN | |
| | | 58.91 | 76.08 | 43.75 | 89.52 | 84.52 | 96.18 | 6.93 | gbert1 | |
| | | 58.88 | 75.07 | 44.58 | 84.64 | 84.86 | 96.23 | 5.54 | glimple_all | |
| | | 58.72 | 76.4 | 43.11 | 89.58 | 84.68 | 96.21 | 6.58 | ours-4-gqa_el_tag_v4__pretrain_rel_tag_dist_tc_v7_checkpoint-47-157510-best-4.json | |
| | | 58.42 | 77.39 | 41.67 | 90.29 | 84.53 | 95.57 | 7.86 | Partial-MSP | |
| | | 58.2 | 75.91 | 42.57 | 88.25 | 84.72 | 96.08 | 5.81 | UCAS-SARI | |
| | | 58.12 | 76.39 | 42.0 | 88.01 | 84.8 | 96.06 | 5.65 | stu09e | |
| | | 58.06 | 76.6 | 41.7 | 90.96 | 85.27 | 96.31 | 7.6 | happyTeam | |
| | | 57.89 | 74.54 | 43.19 | 85.45 | 84.99 | 96.4 | 5.73 | graphRepresentation, Single | |
| | | 57.79 | 75.37 | 42.26 | 88.3 | 84.85 | 96.11 | 5.65 | VqaStar-UCAS-SARI | |
| | | 57.77 | 75.78 | 41.86 | 86.85 | 84.97 | 96.44 | 5.36 | REX | |
| | | 57.65 | 75.22 | 42.14 | 87.35 | 84.73 | 96.18 | 5.48 | MLVQA (single) | |
| | | 57.35 | 75.07 | 41.71 | 87.61 | 84.5 | 95.86 | 5.94 | rsa-14word | |
| | | 57.21 | 74.46 | 41.99 | 87.6 | 84.87 | 96.2 | 5.6 | result_run_2647872_epoch11 | |
| | | 57.14 | 75.07 | 41.31 | 87.36 | 84.49 | 95.87 | 5.29 | DeeTee | |
| | | 57.1 | 76.0 | 40.41 | 91.7 | 85.58 | 96.16 | 10.52 | BAN | |
| | | 57.07 | 73.77 | 42.33 | 84.68 | 84.81 | 96.48 | 4.7 | LCGN | |
| | | 57.01 | 74.78 | 41.32 | 87.74 | 84.25 | 96.03 | 6.06 | RSN (Single Model) | |
| | | 56.96 | 74.97 | 41.06 | 85.12 | 84.85 | 96.38 | 7.13 | GM6_9_2_train | |
| | | 56.95 | 75.01 | 41.02 | 90.49 | 85.46 | 96.37 | 9.5 | wcf-fight | |
| | | 56.95 | 74.62 | 41.36 | 87.71 | 84.57 | 95.98 | 5.81 | total14 | |
| | | 56.65 | 73.65 | 41.64 | 84.35 | 84.37 | 95.94 | 6.07 | Testify | |
| | | 56.59 | 73.0 | 42.11 | 84.7 | 84.86 | 96.4 | 4.68 | F205 | |
| | | 56.38 | 74.84 | 40.09 | 91.71 | 83.76 | 95.43 | 6.32 | Feb_ft2_mergeadd_weightalllstm_picklocw_box5_prep | |
| | | 56.28 | 73.73 | 40.87 | 86.86 | 84.2 | 96.01 | 5.78 | MMT-VQA | |
| | | 56.18 | 72.84 | 41.47 | 85.46 | 84.04 | 96.18 | 5.42 | IWantADonut | |
| | | 56.16 | 73.56 | 40.8 | 84.99 | 84.83 | 96.4 | 5.87 | GIN | |
| | | 56.11 | 72.65 | 41.52 | 85.51 | 84.36 | 96.25 | 5.42 | LOGNet+VLR | |
| | | 56.09 | 73.4 | 40.82 | 85.11 | 84.79 | 96.37 | 5.14 | Improved SNMN | |
| | | 56.0 | 73.9 | 40.2 | 87.16 | 84.45 | 96.01 | 6.02 | ST_VQA | |
| | | 55.93 | 71.81 | 41.93 | 83.2 | 85.09 | 96.01 | 6.05 | RD | |
| | | 55.7 | 72.88 | 40.53 | 83.52 | 84.81 | 96.39 | 5.32 | Deepblue_Semantics | |
| | | 55.65 | 72.86 | 40.46 | 89.18 | 85.27 | 96.33 | 9.69 | LW | |
| | | 55.57 | 72.39 | 40.74 | 83.32 | 84.24 | 96.15 | 10.18 | RSN (Single Model)_v6 | |
| | | 55.41 | 72.87 | 39.99 | 83.06 | 84.74 | 96.35 | 5.48 | nogg | |
| | | 55.35 | 72.65 | 40.08 | 84.17 | 84.56 | 96.32 | 5.22 | abc_test | |
| | | 55.0 | 72.09 | 39.92 | 83.47 | 84.66 | 96.34 | 5.29 | KU | |
| | | 54.94 | 71.7 | 40.14 | 82.71 | 84.78 | 96.4 | 5.1 | Eden_test | |
| | | 54.79 | 72.42 | 39.23 | 86.1 | 84.55 | 95.92 | 6.01 | HDU_ZWF | |
| | | 54.15 | 69.3 | 40.79 | 82.36 | 85.15 | 95.99 | 5.41 | vips | |
| | | 54.06 | 71.23 | 38.91 | 81.59 | 84.48 | 96.16 | 5.34 | MAC | |
| | | 53.89 | 72.52 | 37.44 | 87.47 | 85.05 | 96.39 | 8.66 | 5TMT-qe+o | |
| | | 53.85 | 68.44 | 40.97 | 80.2 | 85.19 | 96.28 | 5.84 | ZhaoLab | |
| | | 53.57 | 70.15 | 38.94 | 81.14 | 84.67 | 96.36 | 5.32 | test | |
| | | 53.31 | 70.41 | 38.23 | 80.33 | 84.32 | 95.99 | 6.4 | Sorbonne | |
| | | 52.3 | 68.46 | 38.04 | 84.36 | 85.2 | 96.2 | 12.54 | UJCNN | |
| | | 52.19 | 69.15 | 37.22 | 78.34 | 83.44 | 95.45 | 5.69 | MJ | |
| | | 52.02 | 67.35 | 38.5 | 80.44 | 83.94 | 95.75 | 5.64 | mac_qin | |
| | | 51.87 | 67.99 | 37.64 | 80.2 | 84.35 | 96.25 | 6.77 | Mithrandir | |
| | | 51.51 | 67.82 | 37.11 | 79.7 | 83.69 | 95.82 | 6.16 | happy | |
| | | 51.22 | 69.36 | 35.2 | 82.44 | 83.82 | 96.12 | 6.45 | Space Cat | |
| Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering | ✓ | 49.74 | 66.64 | 34.83 | 78.71 | 84.57 | 96.18 | 5.98 | BottomUp | 2017-07-25 |
| | | 49.28 | 67.59 | 33.12 | 83.68 | 83.41 | 94.95 | 14.28 | RAM_BUGGY | |
| | | 49.27 | 66.57 | 34.0 | 78.51 | 84.58 | 95.78 | 6.91 | sparsemax15 | |
| | | 48.97 | 63.85 | 35.83 | 83.85 | 83.93 | 95.62 | 13.72 | mfb+bert | |
| | | 48.44 | 65.02 | 33.81 | 81.19 | 85.29 | 96.15 | 17.79 | RES | |
| | | 47.72 | 66.28 | 31.34 | 84.16 | 84.52 | 95.45 | 19.05 | LAS | |
| | | 47.38 | 58.76 | 37.34 | 73.71 | 81.75 | 94.55 | 6.29 | test | |
| | | 46.55 | 63.26 | 31.8 | 74.57 | 84.25 | 96.02 | 7.46 | LSTM+CNN | |
| | | 45.86 | 64.74 | 29.2 | 70.57 | 86.13 | 96.61 | 8.38 | 113 | |
| | | 44.06 | 57.57 | 32.13 | 38.18 | 75.19 | 85.94 | 8.35 | Ediburgh-Mila-UCLA | |
| | | 43.84 | 59.24 | 30.24 | 67.71 | 84.01 | 95.32 | 10.99 | bear | |
| | | 42.75 | 61.21 | 26.45 | 63.51 | 84.2 | 95.99 | 7.63 | CHAIR | |
| | | 41.63 | 55.12 | 29.73 | 82.21 | 77.4 | 92.27 | 13.01 | MReaL | |
| | | 41.07 | 61.9 | 22.69 | 68.68 | 87.3 | 96.39 | 17.93 | LSTM | |
| | | 40.3 | 61.18 | 21.88 | 74.11 | 86.13 | 96.14 | 40.44 | Academia Sinica | |
| | | 37.03 | 56.61 | 19.74 | 63.96 | 85.12 | 95.76 | 28.4 | Fj | |
| | | 36.75 | 55.24 | 20.44 | 69.93 | 84.13 | 95.1 | 40.84 | Mycsulb | |
| | | 31.24 | 47.9 | 16.66 | 54.04 | 84.31 | 84.33 | 13.98 | LocalPrior | |
| | | 28.9 | 42.94 | 16.62 | 51.69 | 74.81 | 88.86 | 93.08 | GlobalPrior | |
| | | 26.45 | 45.69 | 9.47 | 55.23 | 50.93 | 60.81 | 11.49 | muc_ai | |
| | | 17.82 | 36.05 | 1.74 | 62.4 | 34.84 | 35.78 | 19.99 | CNN | |